Meet your thinking partner.

Tackle any big, bold, bewildering challenge with Hanzo AI.

Preview

Some tasks just work better on macOS

With Hanzo Dev, AI can now work directly with your local files and tools. Available in the macOS app.

Complete AI Engineering Toolkit

Everything you need to build, deploy, and manage production-grade AI applications

AI Model Hub

Access a catalog of production-grade foundation models from leading providers and Hanzo's specialized models.

AI Agents

Create autonomous agents that can reason, plan, and execute complex tasks with minimal human intervention.

Optimized Runtime

High-performance inference with automatic batching, caching, and efficient resource utilization.

Vector Database

Built-in vector storage for embeddings with automatic indexing and retrieval optimization.

Evaluation Suite

Comprehensive tools for testing, evaluating, and benchmarking AI models and applications.

AI Observability

Full visibility into AI system behavior with detailed metrics, logging, and performance analytics.

AI Safety & Guardrails

Advanced content filtering, privacy controls, and ethical guardrails for responsible AI deployment.

Enterprise Scale

Built for high-scale production workloads with auto-scaling, high availability, and global distribution.

Developer SDK

Intuitive libraries for Python, TypeScript, and other languages with comprehensive documentation.

Model Serving

Simplified deployment and management of custom models with automatic versioning and A/B testing.

Fine-tuning

User-friendly tools for customizing foundation models to your specific use cases and data.

Unified AI Platform

A complete suite of AI capabilities accessible through a single, consistent API with everything you need to build powerful AI applications

Model Hub Access

One API for thousands of models from top providers and the open-source community

Vector Database

Built-in vector storage with automatic indexing for semantic search and RAG applications

Document Processing

Process, chunk, and index documents in 30+ formats with automatic metadata extraction
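
To make the chunking step concrete, here is a minimal sketch of fixed-size chunking with overlap. This page does not describe how the platform chunks documents, so `chunkText` and its `size` and `overlap` parameters are hypothetical names chosen for illustration only.

```typescript
// Hypothetical helper for illustration; not part of the Hanzo SDK.
// Splits text into windows of `size` characters, where each window
// repeats the last `overlap` characters of the previous one.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("require size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this window reached the end
  }
  return chunks;
}

const chunks = chunkText("abcdefghij", 4, 1);
console.log(chunks); // ["abcd", "defg", "ghij"]
```

The overlap keeps a little shared context between neighboring chunks, which tends to help retrieval when a sentence straddles a boundary.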

Semantic Search

Natural language search across your knowledge base with advanced relevance tuning

Agent Framework

Build autonomous AI agents with reasoning, planning, and tool-use capabilities

Code Generation

Specialized models for code completion, refactoring, and documentation

Workflow Orchestration

Chain AI operations with built-in caching, observability, and error handling
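
One way to picture chaining with caching and error handling is as small wrappers around async steps. The sketch below is an assumption about the general pattern, not the platform's orchestration API; `Step`, `cached`, `withRetry`, `chain`, and `demo` are all hypothetical names.

```typescript
// Hypothetical helpers for illustration; not part of the Hanzo SDK.
type Step = (input: string) => Promise<string>;

// Wrap a step so repeated inputs are served from an in-memory cache.
function cached(step: Step): Step {
  const cache = new Map<string, string>();
  return async (input) => {
    const hit = cache.get(input);
    if (hit !== undefined) return hit;
    const output = await step(input);
    cache.set(input, output);
    return output;
  };
}

// Wrap a step so a failure is retried, up to `attempts` tries in total
// (attempts is expected to be at least 1).
function withRetry(step: Step, attempts: number): Step {
  return async (input) => {
    let lastError: unknown;
    for (let i = 0; i < attempts; i++) {
      try {
        return await step(input);
      } catch (err) {
        lastError = err;
      }
    }
    throw lastError;
  };
}

// Run steps in order, feeding each output into the next step.
function chain(...steps: Step[]): Step {
  return async (input) => {
    let value = input;
    for (const step of steps) value = await step(value);
    return value;
  };
}

// Stand-in steps: real steps would call a model or a retrieval service.
async function demo(): Promise<{ outputs: string[]; calls: number }> {
  let calls = 0;
  const upper: Step = async (s) => {
    calls++;
    return s.toUpperCase();
  };
  const exclaim: Step = async (s) => s + "!";
  const pipeline = chain(cached(withRetry(upper, 3)), exclaim);
  const first = await pipeline("hi");
  const second = await pipeline("hi"); // first step is served from the cache
  return { outputs: [first, second], calls };
}

demo().then((result) => console.log(result));
```

In this sketch the cache sits outside the retry wrapper, so only successful results are stored and a cached input never triggers a retry.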

Usage Analytics

Comprehensive analytics and cost tracking across all AI operations

Security & Compliance

Enterprise-grade security with data residency options and compliance features

Chat Interfaces

Pre-built chat components with memory management and streaming responses

Prompt Management

Version, test, and optimize prompts across different models and environments

Optimized Edge Serving

Global edge deployment for ultra-low latency AI inference and responses

AI Engineering Platform

Build, deploy, and scale AI applications with an integrated suite of tools designed for modern engineering teams.

AI Model Registry

Model             Type   Provider      Accuracy  Latency
Zen4              LLM    Hanzo         96%       85ms
GPT-5.3.3         LLM    OpenAI        97%       120ms
Claude            LLM    Anthropic     96%       140ms
Gemini 2.5        LLM    Google        95%       110ms
Zen4 Pro          LLM    Hanzo         94%       80ms
Mixtral           LLM    Mistral       93%       85ms
Llama 4           LLM    Meta          94%       90ms
Zen4 Mini         LLM    Hanzo         95%       75ms
Stable Diffusion  Image  Stability AI  92%       200ms
Cohere Command    LLM    Cohere        91%       95ms

Model Integration

Integrate with OpenAI, Hugging Face, and other machine learning platforms. Deploy and serve custom ML models with built-in scaling and monitoring.

  • One-click API connections to popular ML services
  • Simple deployment of custom models with containerization
  • Performance optimization for inference workloads

Vector Search

High-performance vector database capabilities for semantic search, RAG applications, and similarity matching across billions of vectors.

  • Advanced indexing for fast k-NN and ANN queries
  • Built-in embeddings generation from text and images
  • Hybrid search combining vector and traditional queries
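
For readers new to vector search, a k-NN query can be sketched as a brute-force cosine-similarity scan. This is an illustration of the idea only, not Hanzo's index or API: `EmbeddedDoc`, `cosineSimilarity`, and `topK` are hypothetical names, and a production index would typically use an ANN structure instead of scoring every vector.

```typescript
// Hypothetical helpers for illustration; not part of the Hanzo SDK.
type EmbeddedDoc = { id: string; vector: number[] };

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exact k-NN: score every document, sort by similarity, keep the top k.
function topK(
  query: number[],
  docs: EmbeddedDoc[],
  k: number
): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const docs: EmbeddedDoc[] = [
  { id: "a", vector: [1, 0] },
  { id: "b", vector: [0, 1] },
  { id: "c", vector: [1, 1] },
];
const nearest = topK([1, 0], docs, 2);
console.log(nearest.map((r) => r.id)); // ["a", "c"]
```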

Data Processing

Specialized data processing pipelines for cleaning, transforming, and enriching training data for machine learning models.

  • Automated ETL workflows for AI data preparation
  • Data versioning and lineage tracking
  • Scalable batch and stream processing

AI-Enhanced Features

Ready-to-use AI capabilities that can be integrated into applications with minimal configuration.

  • Content generation and summarization
  • Image and video analysis with computer vision
  • Real-time anomaly detection and predictive analytics

Unified Model Access

Access thousands of AI models through a single, unified API with consistent interfaces and predictable pricing

Hanzo Zen Models

41 foundation models across language, code, vision, image, audio, speech, and retrieval

Zen5

Next-generation agentic frontier model with native chain-of-thought.

via Hanzo

  • 1M+ context window
  • Agentic-trained
  • MoDE + CoT
  • Preview

Zen5 Pro

High-throughput agentic model for demanding production workloads.

via Hanzo

  • 512K context window
  • Agentic-trained
  • Production optimized
  • Preview

Zen5 Max

Maximum context agentic model for document-scale analysis.

via Hanzo

  • 2M context window
  • Extended CoT
  • Document-scale
  • Preview

Zen5 Ultra

Deepest reasoning model with multi-pass chain-of-thought.

via Hanzo

  • 1M context window
  • Deep CoT
  • Self-verification
  • Preview

Zen5 Mini

Efficient agentic model with Zen5-class intelligence.

via Hanzo

  • 256K context window
  • Agentic-trained
  • Cost efficient
  • Preview

Zen4

Flagship model for complex reasoning and multi-domain tasks.

via Hanzo

  • 744B MoE (40B active)
  • 202K context window
  • Ultra Max tier
  • $3 / $9.60 per MTok
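
As a rough worked example of per-token pricing, the sketch below estimates the cost of one request. It assumes that "$3 / $9.60 per MTok" means $3 per million input tokens and $9.60 per million output tokens; this page does not say which figure is which, so treat the split as an assumption. `estimateCostUsd` is a hypothetical helper name.

```typescript
// Assumption: $3 per million input tokens, $9.60 per million output tokens.
// The page lists "$3 / $9.60 per MTok" without labeling the two figures.
const INPUT_USD_PER_MTOK = 3;
const OUTPUT_USD_PER_MTOK = 9.6;

// Hypothetical helper: estimated USD cost of a single request.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1e6) * INPUT_USD_PER_MTOK +
    (outputTokens / 1e6) * OUTPUT_USD_PER_MTOK
  );
}

// 200,000 input tokens and 50,000 output tokens: 0.60 + 0.48 USD.
const cost = estimateCostUsd(200000, 50000);
console.log(cost.toFixed(2)); // "1.08"
```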

Zen4 Ultra

Maximum reasoning with extended chain-of-thought.

via Hanzo

  • 744B MoE (40B active) + CoT
  • 262K context window
  • Ultra Max tier
  • Deep reasoning

Zen4 Pro

High-capability model with efficient MoE architecture.

via Hanzo

  • 80B MoE (3B active)
  • 131K context window
  • Ultra tier
  • Efficient MoE

Zen4 Max

Most capable model for complex reasoning and agentic tasks.

via Hanzo

  • Dense architecture
  • 1M context window
  • Ultra Max tier
  • Agentic coding

Zen4.6

Extended context for long-document analysis and agentic workflows.

via Hanzo

  • Dense architecture
  • 1M context window
  • Ultra tier
  • Cost efficient

Zen4 Mini

Ultra-fast lightweight model, ideal for free tier.

via Hanzo

  • Dense architecture
  • 128K context window
  • Starter tier
  • Free tier

Zen4 Thinking

Dedicated reasoning with explicit chain-of-thought.

via Hanzo

  • 80B MoE (3B active) + CoT
  • 131K context window
  • Pro Max tier
  • Chain-of-thought

Zen4 Coder

Code-specialized MoE for generation, review, and debugging.

via Hanzo

  • 480B MoE (35B active)
  • 163K context window
  • Ultra tier
  • Code generation

Zen4 Coder Pro

Full-precision BF16 code model for complex codebases.

via Hanzo

  • 480B Dense BF16
  • 131K context window
  • Ultra Max tier
  • Full-precision

Zen4 Coder Flash

Lightweight code model for speed and inline completions.

via Hanzo

  • 30B MoE (3B active)
  • 262K context window
  • Pro Max tier
  • Fast completions

Zen3 Omni

Multimodal model supporting text, vision, audio, and structured output.

via Hanzo

  • ~200B Dense Multimodal
  • 202K context window
  • Pro Max tier
  • Text + Vision + Audio

Zen3 VL

Vision-language model for image understanding and visual reasoning.

via Hanzo

  • 30B MoE (3B active)
  • 262K context window
  • Pro Max tier
  • Image understanding

Zen3 Nano

Ultra-lightweight model for edge deployment.

via Hanzo

  • 8B Dense
  • 128K context window
  • Starter tier
  • Free tier

Zen3 Guard

Content safety classifier for moderation and guardrails.

via Hanzo

  • 4B Dense
  • 65K context window
  • Pro tier
  • 119 languages

Zen3 Image

Best general-purpose image generation.

via Hanzo

  • Diffusion
  • Text-to-image
  • Image editing
  • $0.04/image

Zen3 Image Max

Maximum quality image generation.

via Hanzo

  • Diffusion
  • Maximum quality
  • Professional creative
  • $0.08/image

Zen3 Image Dev

Development model for experimentation.

via Hanzo

  • Diffusion
  • Development
  • Iteration
  • $0.0005/step

Zen3 Image Fast

Fastest image model for real-time generation.

via Hanzo

  • Diffusion
  • Ultra-fast
  • Real-time
  • $0.00035/step

Zen3 Image SDXL

High-resolution image generation at 1024px.

via Hanzo

  • Diffusion
  • 1024px
  • High-resolution

Zen3 Image Playground

Aesthetic model for artistic generation.

via Hanzo

  • Diffusion
  • Aesthetic
  • Artistic

Zen3 Image SSD

Fastest diffusion model for real-time generation.

via Hanzo

  • 1B Diffusion
  • Fastest
  • Real-time

Zen3 Image JP

Japanese-specialized image generation.

via Hanzo

  • Diffusion
  • Japanese
  • Specialized

Zen3 Audio

Best quality speech-to-text transcription.

via Hanzo

  • 1.5B ASR
  • 100+ languages
  • Best accuracy

Zen3 Audio Fast

Fastest speech-to-text for high-throughput workloads.

via Hanzo

  • 809M ASR
  • Fastest
  • Batch optimized

Zen3 ASR

Real-time streaming speech recognition.

via Hanzo

  • Streaming ASR
  • Real-time
  • Sub-500ms latency

Zen3 ASR v1

First-generation streaming ASR.

via Hanzo

  • Streaming ASR
  • Legacy
  • Compatible

Zen3 TTS

High-quality text-to-speech with natural prosody.

via Hanzo

  • 82M TTS
  • 40+ voices
  • 8 languages

Zen3 TTS HD

Maximum fidelity text-to-speech.

via Hanzo

  • TTS HD
  • Broadcast-grade
  • 48kHz output

Zen3 TTS Fast

Low-latency TTS for real-time voice agents.

via Hanzo

  • 82M TTS
  • Low latency
  • Voice agents

Zen3 Embedding

High-quality text embeddings for RAG and search.

via Hanzo

  • 3072 dimensions
  • 8K context window
  • Pro Max tier

Zen3 Embedding Medium

Balanced embedding model for retrieval.

via Hanzo

  • 4B parameters
  • 40K context window
  • Cost-effective

Zen3 Embedding Small

Lightweight embedding for high throughput.

via Hanzo

  • 0.6B parameters
  • 32K context window
  • High-throughput

Zen3 Embedding OpenAI

OpenAI-compatible embedding endpoint.

via Hanzo

  • 3072 dimensions
  • 8K context window
  • OpenAI compatible

Zen3 Reranker

High-quality reranker for RAG pipelines.

via Hanzo

  • 8B parameters
  • 40K context window
  • RAG accuracy

Zen3 Reranker Medium

Balanced reranker for retrieval.

via Hanzo

  • 4B parameters
  • 40K context window
  • Cost-effective

Zen3 Reranker Small

Lightweight reranker for high throughput.

via Hanzo

  • 0.6B parameters
  • 40K context window
  • Minimal cost

Third-Party Models

100+ industry-leading models available through the Hanzo AI Cloud gateway

Claude Opus 4.6

Anthropic's most powerful model for the hardest tasks.

via Anthropic

  • 1M context window
  • Most capable model
  • Complex reasoning
  • Extended thinking

Claude Sonnet 4.6

Ideal balance of capability and speed for production workloads.

via Anthropic

  • 1M context window
  • Best balance of speed and intelligence
  • Strong coding
  • Fast inference

Claude Haiku 4.5

Fastest and most affordable Claude model for high-throughput tasks.

via Anthropic

  • 200K context window
  • Fastest Anthropic model
  • Cost efficient
  • Low latency

GPT-5.3.3

OpenAI's flagship model with advanced reasoning capabilities.

via OpenAI

  • 400K context window
  • Multimodal
  • Advanced reasoning
  • Tool use

GPT-5.3.3 Mini

Cost-efficient OpenAI model for everyday tasks.

via OpenAI

  • 400K context window
  • Fast and affordable
  • Good quality
  • Low latency

Zen4 Ultra

Advanced reasoning model with extended chain-of-thought.

via Hanzo

  • 202K context window
  • Reasoning model
  • Chain-of-thought
  • Math and code

Zen4

Flagship general-purpose model with strong benchmarks.

via Hanzo

  • 202K context window
  • 744B MoE
  • Strong general performance
  • Open-weight

Gemini 3.1 Pro

Google's flagship with the longest context window.

via Google

  • 1M context window
  • Multimodal
  • Long-context reasoning
  • Code generation

Custom Models

Deploy and customize models to meet your specific needs

Fine-tuned Models

via Custom

  • Domain adaptation
  • Company knowledge base
  • Specialized tasks
  • Improved performance

Hugging Face Models

via Custom

  • Community models
  • Thousands of options
  • Specialized capabilities
  • Open source

Custom Embedding Models

via Custom

  • Domain-specific embeddings
  • Custom similarity metrics
  • Enhanced search
  • Optimized retrieval

Single API for Everything

Our unified API provides direct access to all AI capabilities through a consistent, developer-friendly interface

Model Routing

Smart routing to optimal models based on task, cost, and performance requirements
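
A simple routing rule might look like the sketch below: pick the fastest model that clears an accuracy floor. This is a hypothetical rule for illustration, not Hanzo's routing logic; `ModelInfo` and `route` are assumed names, and the figures are taken from the AI Model Registry listing on this page.

```typescript
// Hypothetical routing rule for illustration; not Hanzo's routing logic.
type ModelInfo = { name: string; accuracy: number; latencyMs: number };

// Accuracy (%) and latency figures as listed in the AI Model Registry.
const registry: ModelInfo[] = [
  { name: "Zen4", accuracy: 96, latencyMs: 85 },
  { name: "GPT-5.3.3", accuracy: 97, latencyMs: 120 },
  { name: "Zen4 Mini", accuracy: 95, latencyMs: 75 },
  { name: "Mixtral", accuracy: 93, latencyMs: 85 },
];

// Return the lowest-latency model whose accuracy meets the floor,
// or undefined when no model qualifies.
function route(models: ModelInfo[], minAccuracy: number): ModelInfo | undefined {
  return models
    .filter((m) => m.accuracy >= minAccuracy)
    .sort((a, b) => a.latencyMs - b.latencyMs)[0];
}

console.log(route(registry, 96)?.name); // "Zen4"
console.log(route(registry, 90)?.name); // "Zen4 Mini"
```

A real router would likely weigh cost and task type as well; a single accuracy floor keeps the example short.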

Document Processing

Built-in document parsing, chunking, and semantic analysis capabilities

Vector Search

Integrated vector database for semantic search and retrieval-augmented generation

Knowledge Base

Create, manage, and query custom knowledge bases for your AI applications

Versatile AI Use Cases

Hanzo's AI platform supports a wide range of intelligent applications across industries

Conversational AI

Build intelligent chatbots, virtual assistants, and customer support agents with natural language understanding.

Generative Content

Create text, images, code, and other content with AI-powered generation and customization.

Knowledge Retrieval

Implement semantic search, question answering, and information extraction from your data.

Autonomous Agents

Deploy AI agents that can perform complex tasks, make decisions, and execute workflows autonomously.

Developer Tooling

Enhance your development workflow with AI-powered code generation, debugging, and documentation.

Voice & Speech

Convert speech to text, text to speech, and analyze voice interactions with advanced AI models.

Simple Implementation

Build powerful AI applications with just a few lines of code using our intuitive SDK

import { Hanzo } from '@hanzo/ai';

// Initialize the Hanzo AI client
const hanzo = new Hanzo({
  apiKey: process.env.HANZO_API_KEY
});

// Create a conversation with memory
const conversation = hanzo.conversation({
  model: 'gpt-5.3',
  memory: true,
  system: 'You are a helpful assistant'
});

// Send a message and get a response
const response = await conversation.send('Tell me about AI engineering');

console.log(response);

Documentation Example

Vector Search

// Create a vector store
const vectorStore = hanzo.vectorStore('my-store');

// Add documents to the store
await vectorStore.addDocuments([
  { text: 'AI engineering best practices...' },
  { text: 'Deploying models to production...' }
]);

// Search for similar documents
const results = await vectorStore.search(
  'How to deploy AI models?',
  { limit: 3 }
);

AI Agents

// Create an agent with tools
const agent = hanzo.agent({
  model: 'claude-opus-4-6',
  tools: [
    hanzo.tools.webSearch(),
    hanzo.tools.codeInterpreter(),
    vectorStore.asTool('knowledge')
  ]
});

// Run the agent with a task
const result = await agent.run(
  'Analyze our production metrics and suggest optimizations'
);

Trusted by Industry Leaders

Powering AI innovation at organizations of all sizes, from startups to Fortune 500 companies

Microsoft
Airbnb
Netflix
Stripe
Shopify
Spotify
Slack
Amazon
5.0

"Hanzo's AI platform has transformed our ability to ship AI features quickly. What used to take months now takes days."

Jane Doe
CTO, TechInnovate
5.0

"The observability features are game-changing. We finally have full visibility into our AI systems in production."

Michael Smith
AI Lead, EnterpriseAI
5.0

"Our team went from prototype to production in just days. The SDK is intuitive and the documentation is excellent."

Emma Johnson
VP Engineering, StartupX

Experiences from Our Community

Hear from engineering teams who are building the next generation of AI-powered applications

"Hanzo gave us the infrastructure backbone to move fast without rebuilding from scratch. The platform let our team focus on the product, not the plumbing."

Jay Giraud
CEO, Damon Motorcycles

"We needed a platform that could handle real-time data at scale without sacrificing developer experience. Hanzo delivered on both fronts."

Marcus Weller
CEO, SKULLY Technologies

"Hanzo's AI infrastructure helped us personalize experiences for millions of users while keeping our stack lean and our team focused on what matters."

Sandro Mur
CEO, Bellabeat

The AI Engineering Community

Join thousands of AI engineers and developers building the future of intelligent applications. Share experiences, get support, and collaborate on best practices.

  • Active developer community
  • Weekly office hours
  • Dedicated support team

Start Building the Future of AI

Join thousands of developers and companies who are building intelligent, scalable applications with Hanzo's AI Engineering Platform

Documentation

Comprehensive guides, tutorials, and API references to help you build with Hanzo AI.

Explore Docs

Quickstart

Get up and running quickly with our step-by-step quickstart guides and example projects.

Try Quickstart

Community

Join our growing community of AI engineers, get support, and share your experiences.

Join Community

Ready to get started?

Sign up for free and start building with Hanzo AI today.