Meet your thinking partner.
Tackle any big, bold, bewildering challenge with Hanzo AI.
Complete AI Engineering Toolkit
Everything you need to build, deploy, and manage production-grade AI applications
AI Model Hub
Access a catalog of production-grade foundation models from leading providers and Hanzo's specialized models.
AI Agents
Create autonomous agents that can reason, plan, and execute complex tasks with minimal human intervention.
Optimized Runtime
High-performance inference with automatic batching, caching, and efficient resource utilization.
Vector Database
Built-in vector storage for embeddings with automatic indexing and retrieval optimization.
Evaluation Suite
Comprehensive tools for testing, evaluating, and benchmarking AI models and applications.
AI Observability
Full visibility into AI system behavior with detailed metrics, logging, and performance analytics.
AI Safety & Guardrails
Advanced content filtering, privacy controls, and ethical guardrails for responsible AI deployment.
Enterprise Scale
Built for high-scale production workloads with auto-scaling, high availability, and global distribution.
Developer SDK
Intuitive libraries for Python, TypeScript, and other languages with comprehensive documentation.
Model Serving
Simplified deployment and management of custom models with automatic versioning and A/B testing.
Fine-tuning
User-friendly tools for customizing foundation models to your specific use cases and data.
Unified AI Platform
A complete suite of AI capabilities accessible through a single, consistent API with everything you need to build powerful AI applications
Model Hub Access
One API for thousands of models from top providers and the open-source community
Vector Database
Built-in vector storage with automatic indexing for semantic search and RAG applications
Document Processing
Process, chunk, and index documents in 30+ formats with automatic metadata extraction
Semantic Search
Natural language search across your knowledge base with advanced relevance tuning
Agent Framework
Build autonomous AI agents with reasoning, planning, and tool-use capabilities
Code Generation
Specialized models for code completion, refactoring, and documentation
Workflow Orchestration
Chain AI operations with built-in caching, observability, and error handling
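As a rough illustration of what chaining operations with caching and error handling can look like, here is a minimal sketch. It does not use the Hanzo SDK: `withCache`, `runChain`, `summarize`, and `shout` are hypothetical names, and the steps are plain synchronous functions standing in for model calls.

```typescript
// Illustrative sketch only; none of these names come from the Hanzo SDK.
type Step = (input: string) => string;

// Wrap a step so repeated inputs are served from an in-memory cache.
function withCache(step: Step): Step {
  const cache: { [key: string]: string } = {};
  return (input) => {
    if (Object.prototype.hasOwnProperty.call(cache, input)) {
      return cache[input];
    }
    const output = step(input);
    cache[input] = output;
    return output;
  };
}

type ChainResult =
  | { ok: true; output: string }
  | { ok: false; failedStep: number; error: string };

// Run steps in order, feeding each output into the next step.
// A thrown error stops the chain and reports which step failed.
function runChain(steps: Step[], input: string): ChainResult {
  let current = input;
  for (let i = 0; i < steps.length; i++) {
    try {
      current = steps[i](current);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { ok: false, failedStep: i, error: message };
    }
  }
  return { ok: true, output: current };
}

// Stand-ins for real operations: a "summarizer" that truncates, then uppercasing.
let summarizeCalls = 0;
const summarize = withCache((text) => {
  summarizeCalls++;
  return text.slice(0, 10);
});
const shout: Step = (text) => text.toUpperCase();

const first = runChain([summarize, shout], "hello wonderful world");
const second = runChain([summarize, shout], "hello wonderful world");
console.log(first, second, summarizeCalls);
```

The second run reuses the cached summary, so the wrapped step executes only once for the same input.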
Usage Analytics
Comprehensive analytics and cost tracking across all AI operations
Security & Compliance
Enterprise-grade security with data residency options and compliance features
Chat Interfaces
Pre-built chat components with memory management and streaming responses
Prompt Management
Version, test, and optimize prompts across different models and environments
Optimized Edge Serving
Global edge deployment for ultra-low-latency AI inference
AI Engineering Platform
Build, deploy, and scale AI applications with an integrated suite of tools designed for modern engineering teams.
AI Model Registry
Model Integration
Integrate with OpenAI, Hugging Face, and other machine learning platforms. Deploy and serve custom ML models with built-in scaling and monitoring.
- One-click API connections to popular ML services
- Simple deployment of custom models with containerization
- Performance optimization for inference workloads
Vector Search
High-performance vector database capabilities for semantic search, RAG applications, and similarity matching across billions of vectors.
- Advanced indexing for fast k-NN and ANN queries
- Built-in embeddings generation from text and images
- Hybrid search combining vector and traditional queries
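To make the k-NN idea concrete, here is a minimal sketch of exact (brute-force) nearest-neighbor search using cosine similarity. It is not how Hanzo's vector database is implemented: the `Doc` type, `cosineSimilarity`, and `kNearest` are hypothetical names, and production systems typically use approximate (ANN) indexes rather than scanning every vector.

```typescript
// Illustrative sketch only: exact k-NN over a small in-memory list of vectors.
type Doc = { id: string; vector: number[] };

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vector: treat as no similarity
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every document against the query and keep the k highest scores.
function kNearest(query: number[], docs: Doc[], k: number): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy 3-dimensional "embeddings" (made-up values for illustration).
const docs: Doc[] = [
  { id: "deploy", vector: [1, 0, 0] },
  { id: "train", vector: [0, 1, 0] },
  { id: "serve", vector: [0.9, 0.1, 0] },
];
const top = kNearest([1, 0, 0], docs, 2);
console.log(top.map((r) => r.id));
```

A brute-force scan costs time proportional to the number of stored vectors, which is why ANN indexes matter at the billion-vector scale mentioned above.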
Data Processing
Specialized data processing pipelines for cleaning, transforming, and enriching training data for machine learning models.
- Automated ETL workflows for AI data preparation
- Data versioning and lineage tracking
- Scalable batch and stream processing
AI-Enhanced Features
Ready-to-use AI capabilities that can be integrated into applications with minimal configuration.
- Content generation and summarization
- Image and video analysis with computer vision
- Real-time anomaly detection and predictive analytics
Unified Model Access
Access thousands of AI models through a single, unified API with consistent interfaces and predictable pricing
Hanzo Zen Models
41 foundation models across language, code, vision, image, audio, speech, and retrieval
Zen5
Next-generation agentic frontier model with native chain-of-thought.
via Hanzo
- 1M+ context window
- Agentic-trained
- MoDE + CoT
- Preview
Zen5 Pro
High-throughput agentic model for demanding production workloads.
via Hanzo
- 512K context window
- Agentic-trained
- Production optimized
- Preview
Zen5 Max
Maximum context agentic model for document-scale analysis.
via Hanzo
- 2M context window
- Extended CoT
- Document-scale
- Preview
Zen5 Ultra
Deepest reasoning model with multi-pass chain-of-thought.
via Hanzo
- 1M context window
- Deep CoT
- Self-verification
- Preview
Zen5 Mini
Efficient agentic model with zen5-class intelligence.
via Hanzo
- 256K context window
- Agentic-trained
- Cost efficient
- Preview
Zen4
Flagship model for complex reasoning and multi-domain tasks.
via Hanzo
- 744B MoE (40B active)
- 202K context window
- Ultra Max tier
- $3 / $9.60 per MTok
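For a sense of what per-MTok pricing means in practice, here is a small worked estimate. It assumes the first figure is the price per million input tokens and the second the price per million output tokens; the listing does not say which is which, so treat that split as an assumption. `estimateCostUsd` is a hypothetical helper, not part of any SDK.

```typescript
// Assumption: $3 per million input tokens, $9.60 per million output tokens.
const ZEN4_INPUT_USD_PER_MTOK = 3;
const ZEN4_OUTPUT_USD_PER_MTOK = 9.6;

// Cost = (tokens / 1,000,000) * price per million tokens, summed over input and output.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1e6) * ZEN4_INPUT_USD_PER_MTOK +
    (outputTokens / 1e6) * ZEN4_OUTPUT_USD_PER_MTOK
  );
}

// Example: 200,000 input tokens and 50,000 output tokens.
// 0.2 * 3 + 0.05 * 9.60 = 0.60 + 0.48 = 1.08 USD
const cost = estimateCostUsd(200000, 50000);
console.log(cost.toFixed(2));
```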
Zen4 Ultra
Maximum reasoning with extended chain-of-thought.
via Hanzo
- 744B MoE (40B active) + CoT
- 262K context window
- Ultra Max tier
- Deep reasoning
Zen4 Pro
High-capability model with efficient MoE architecture.
via Hanzo
- 80B MoE (3B active)
- 131K context window
- Ultra tier
- Efficient MoE
Zen4 Max
Most capable model for complex reasoning and agentic tasks.
via Hanzo
- Dense architecture
- 1M context window
- Ultra Max tier
- Agentic coding
Zen4.6
Extended context for long-document analysis and agentic workflows.
via Hanzo
- Dense architecture
- 1M context window
- Ultra tier
- Cost efficient
Zen4 Mini
Ultra-fast lightweight model, ideal for the free tier.
via Hanzo
- Dense architecture
- 128K context window
- Starter tier
- Free tier
Zen4 Thinking
Dedicated reasoning with explicit chain-of-thought.
via Hanzo
- 80B MoE (3B active) + CoT
- 131K context window
- Pro Max tier
- Chain-of-thought
Zen4 Coder
Code-specialized MoE for generation, review, and debugging.
via Hanzo
- 480B MoE (35B active)
- 163K context window
- Ultra tier
- Code generation
Zen4 Coder Pro
Full-precision BF16 code model for complex codebases.
via Hanzo
- 480B Dense BF16
- 131K context window
- Ultra Max tier
- Full-precision
Zen4 Coder Flash
Lightweight code model for speed and inline completions.
via Hanzo
- 30B MoE (3B active)
- 262K context window
- Pro Max tier
- Fast completions
Zen3 Omni
Multimodal model supporting text, vision, audio, and structured output.
via Hanzo
- ~200B Dense Multimodal
- 202K context window
- Pro Max tier
- Text + Vision + Audio
Zen3 VL
Vision-language model for image understanding and visual reasoning.
via Hanzo
- 30B MoE (3B active)
- 262K context window
- Pro Max tier
- Image understanding
Zen3 Nano
Ultra-lightweight model for edge deployment.
via Hanzo
- 8B Dense
- 128K context window
- Starter tier
- Free tier
Zen3 Guard
Content safety classifier for moderation and guardrails.
via Hanzo
- 4B Dense
- 65K context window
- Pro tier
- 119 languages
Zen3 Image
Best general-purpose image generation.
via Hanzo
- Diffusion
- Text-to-image
- Image editing
- $0.04/image
Zen3 Image Max
Maximum quality image generation.
via Hanzo
- Diffusion
- Maximum quality
- Professional creative
- $0.08/image
Zen3 Image Dev
Development model for experimentation.
via Hanzo
- Diffusion
- Development
- Iteration
- $0.0005/step
Zen3 Image Fast
Fastest image model for real-time generation.
via Hanzo
- Diffusion
- Ultra-fast
- Real-time
- $0.00035/step
Zen3 Image SDXL
High-resolution image generation at 1024px.
via Hanzo
- Diffusion
- 1024px
- High-resolution
Zen3 Image Playground
Aesthetic model for artistic generation.
via Hanzo
- Diffusion
- Aesthetic
- Artistic
Zen3 Image SSD
Fastest diffusion model for real-time generation.
via Hanzo
- 1B Diffusion
- Fastest
- Real-time
Zen3 Image JP
Japanese-specialized image generation.
via Hanzo
- Diffusion
- Japanese
- Specialized
Zen3 Audio
Best quality speech-to-text transcription.
via Hanzo
- 1.5B ASR
- 100+ languages
- Best accuracy
Zen3 Audio Fast
Fastest speech-to-text for high-throughput workloads.
via Hanzo
- 809M ASR
- Fastest
- Batch optimized
Zen3 ASR
Real-time streaming speech recognition.
via Hanzo
- Streaming ASR
- Real-time
- Sub-500ms latency
Zen3 ASR v1
First-generation streaming ASR.
via Hanzo
- Streaming ASR
- Legacy
- Compatible
Zen3 TTS
High-quality text-to-speech with natural prosody.
via Hanzo
- 82M TTS
- 40+ voices
- 8 languages
Zen3 TTS HD
Maximum fidelity text-to-speech.
via Hanzo
- TTS HD
- Broadcast-grade
- 48kHz output
Zen3 TTS Fast
Low-latency TTS for real-time voice agents.
via Hanzo
- 82M TTS
- Low latency
- Voice agents
Zen3 Embedding
High-quality text embeddings for RAG and search.
via Hanzo
- 3072 dimensions
- 8K context window
- Pro Max tier
Zen3 Embedding Medium
Balanced embedding model for retrieval.
via Hanzo
- 4B parameters
- 40K context window
- Cost-effective
Zen3 Embedding Small
Lightweight embedding for high throughput.
via Hanzo
- 0.6B parameters
- 32K context window
- High-throughput
Zen3 Embedding OpenAI
OpenAI-compatible embedding endpoint.
via Hanzo
- 3072 dimensions
- 8K context window
- OpenAI compatible
Zen3 Reranker
High-quality reranker for RAG pipelines.
via Hanzo
- 8B parameters
- 40K context window
- RAG accuracy
Zen3 Reranker Medium
Balanced reranker for retrieval.
via Hanzo
- 4B parameters
- 40K context window
- Cost-effective
Zen3 Reranker Small
Lightweight reranker for high throughput.
via Hanzo
- 0.6B parameters
- 40K context window
- Minimal cost
Third-Party Models
100+ industry-leading models available through the Hanzo AI Cloud gateway
Claude Opus 4.6
Anthropic's most powerful model for the hardest tasks.
via Anthropic
- 1M context window
- Most capable model
- Complex reasoning
- Extended thinking
Claude Sonnet 4.6
Ideal balance of capability and speed for production workloads.
via Anthropic
- 1M context window
- Best balance of speed and intelligence
- Strong coding
- Fast inference
Claude Haiku 4.5
Fastest and most affordable Claude model for high-throughput tasks.
via Anthropic
- 200K context window
- Fastest Anthropic model
- Cost efficient
- Low latency
GPT-5.3.3
OpenAI's flagship model with advanced reasoning capabilities.
via OpenAI
- 400K context window
- Multimodal
- Advanced reasoning
- Tool use
GPT-5.3.3 Mini
Cost-efficient OpenAI model for everyday tasks.
via OpenAI
- 400K context window
- Fast and affordable
- Good quality
- Low latency
Zen4 Ultra
Advanced reasoning model with extended chain-of-thought.
via Hanzo
- 202K context window
- Reasoning model
- Chain-of-thought
- Math and code
Zen4
Flagship general-purpose model with strong benchmarks.
via Hanzo
- 202K context window
- 744B MoE
- Strong general performance
- Open-weight
Gemini 3.1 Pro
Google's flagship with the longest context window.
via Google
- 1M context window
- Multimodal
- Long-context reasoning
- Code generation
Custom Models
Deploy and customize models to meet your specific needs
Fine-tuned Models
via Custom
- Domain adaptation
- Company knowledge base
- Specialized tasks
- Improved performance
Hugging Face Models
via Custom
- Community models
- Thousands of options
- Specialized capabilities
- Open source
Custom Embedding Models
via Custom
- Domain-specific embeddings
- Custom similarity metrics
- Enhanced search
- Optimized retrieval
Single API for Everything
Our unified API provides direct access to all AI capabilities through a consistent, developer-friendly interface
Model Routing
Smart routing to optimal models based on task, cost, and performance requirements
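One way to picture routing by task, cost, and context requirements is the sketch below. It is not Hanzo's routing logic. The context windows and tiers are taken from the model listing on this page, but the task labels, the idea that tiers order models from cheapest to most expensive, and the `route` function itself are assumptions made for illustration.

```typescript
// Illustrative sketch only; the routing rule and task labels are assumptions.
type Task = "chat" | "code";
type Tier = "Starter" | "Pro" | "Pro Max" | "Ultra" | "Ultra Max";

interface ModelInfo {
  name: string;
  contextTokens: number; // approximate, e.g. "128K" taken as 128000
  tier: Tier;
  tasks: Task[];
}

// Assumption: tiers are ordered from cheapest to most expensive.
const TIER_ORDER: Tier[] = ["Starter", "Pro", "Pro Max", "Ultra", "Ultra Max"];

const CATALOG: ModelInfo[] = [
  { name: "Zen4 Mini", contextTokens: 128000, tier: "Starter", tasks: ["chat"] },
  { name: "Zen4 Coder Flash", contextTokens: 262000, tier: "Pro Max", tasks: ["code"] },
  { name: "Zen4 Coder", contextTokens: 163000, tier: "Ultra", tasks: ["code"] },
  { name: "Zen4", contextTokens: 202000, tier: "Ultra Max", tasks: ["chat", "code"] },
  { name: "Zen4 Max", contextTokens: 1000000, tier: "Ultra Max", tasks: ["chat", "code"] },
];

// Pick the lowest-tier model that supports the task and fits the prompt.
// Ties within a tier go to the smaller context window.
function route(task: Task, promptTokens: number): ModelInfo | undefined {
  const candidates = CATALOG.filter(
    (m) => m.tasks.indexOf(task) !== -1 && m.contextTokens >= promptTokens
  );
  candidates.sort(
    (a, b) =>
      TIER_ORDER.indexOf(a.tier) - TIER_ORDER.indexOf(b.tier) ||
      a.contextTokens - b.contextTokens
  );
  return candidates[0];
}

const shortChat = route("chat", 50000);
const longCode = route("code", 200000);
const hugeChat = route("chat", 500000);
const tooLarge = route("chat", 2000000);
console.log(shortChat?.name, longCode?.name, hugeChat?.name, tooLarge);
```

A real router would likely also weigh latency, observed quality, and live pricing; the point here is only that routing reduces to filtering by hard constraints and then ranking by a cost proxy.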
Document Processing
Built-in document parsing, chunking, and semantic analysis capabilities
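Chunking usually means splitting a document into fixed-size pieces that overlap slightly, so that text near a boundary appears in both neighbors. The sketch below shows a character-based version of that idea. `chunkText` is a hypothetical helper, not the platform's chunker, which may split on tokens, sentences, or document structure instead.

```typescript
// Illustrative sketch only: fixed-size character chunks with overlap.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return chunks;
}

// With chunkSize 4 and overlap 1, each chunk starts 3 characters after the last.
const chunks = chunkText("abcdefghij", 4, 1);
console.log(chunks);
```

Each chunk shares its first character with the end of the previous one, which is the overlap at work.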
Vector Search
Integrated vector database for semantic search and retrieval-augmented generation
Knowledge Base
Create, manage, and query custom knowledge bases for your AI applications
Versatile AI Use Cases
Hanzo's AI platform supports a wide range of intelligent applications across industries
Conversational AI
Build intelligent chatbots, virtual assistants, and customer support agents with natural language understanding.
Generative Content
Create text, images, code, and other content with AI-powered generation and customization.
Knowledge Retrieval
Implement semantic search, question answering, and information extraction from your data.
Autonomous Agents
Deploy AI agents that can perform complex tasks, make decisions, and execute workflows autonomously.
Developer Tooling
Enhance your development workflow with AI-powered code generation, debugging, and documentation.
Voice & Speech
Convert speech to text, text to speech, and analyze voice interactions with advanced AI models.
Simple Implementation
Build powerful AI applications with just a few lines of code using our intuitive SDK
import { Hanzo } from '@hanzo/ai';

// Initialize the Hanzo AI client
const hanzo = new Hanzo({
  apiKey: process.env.HANZO_API_KEY
});

// Create a conversation with memory
const conversation = hanzo.conversation({
  model: 'gpt-5.3',
  memory: true,
  system: 'You are a helpful assistant'
});

// Send a message and get a response
const response = await conversation.send('Tell me about AI engineering');
console.log(response);

Documentation Example
Vector Search
// Create a vector store
const vectorStore = hanzo.vectorStore('my-store');

// Add documents to the store
await vectorStore.addDocuments([
  { text: 'AI engineering best practices...' },
  { text: 'Deploying models to production...' }
]);

// Search for similar documents
const results = await vectorStore.search(
  'How to deploy AI models?',
  { limit: 3 }
);

AI Agents
// Create an agent with tools
const agent = hanzo.agent({
  model: 'claude-opus-4-6',
  tools: [
    hanzo.tools.webSearch(),
    hanzo.tools.codeInterpreter(),
    vectorStore.asTool('knowledge')
  ]
});

// Run the agent with a task
const result = await agent.run(
  'Analyze our production metrics and suggest optimizations'
);

Trusted by Industry Leaders
Powering AI innovation at organizations of all sizes, from startups to Fortune 500 companies
"Hanzo's AI platform has transformed our ability to ship AI features quickly. What used to take months now takes days."
"The observability features are game-changing. We finally have full visibility into our AI systems in production."
"Our team went from prototype to production in just days. The SDK is intuitive and the documentation is excellent."
Experiences from Our Community
Hear from engineering teams who are building the next generation of AI-powered applications
"Hanzo gave us the infrastructure backbone to move fast without rebuilding from scratch. The platform let our team focus on the product, not the plumbing."
"We needed a platform that could handle real-time data at scale without sacrificing developer experience. Hanzo delivered on both fronts."
"Hanzo's AI infrastructure helped us personalize experiences for millions of users while keeping our stack lean and our team focused on what matters."
The AI Engineering Community
Join thousands of AI engineers and developers building the future of intelligent applications. Share experiences, get support, and collaborate on best practices.
Start Building the Future of AI
Join thousands of developers and companies who are building intelligent, scalable applications with Hanzo's AI Engineering Platform
Documentation
Comprehensive guides, tutorials, and API references to help you build with Hanzo AI.
Explore Docs
Quickstart
Get up and running quickly with our step-by-step quickstart guides and example projects.
Try Quickstart
Community
Join our growing community of AI engineers, get support, and share your experiences.
Join Community
Ready to get started?
Sign up for free and start building with Hanzo AI today.