Use Hanzo AI with Hugging Face

Access Hugging Face models and call the HF Inference API through Hanzo's unified gateway. You can also use Hanzo AI models with Hugging Face Hub tooling.

Base URL: https://api.hanzo.ai/v1

API Key: Get yours at hanzo.ai/signup · Fully OpenAI-compatible · 390+ models available

Created by Hugging Face

License: Apache-2.0

Hanzo AI is OpenAI-compatible, so existing Hugging Face client code typically works once it points at the Hanzo base URL and uses a Hanzo API key. We deeply appreciate the Hugging Face team for building and maintaining this open-source project.

InferenceClient via Hanzo

python
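# Install the client first (shell command): pip install huggingface_hub
from huggingface_hub import InferenceClient

# Point the Hugging Face client at Hanzo's OpenAI-compatible endpoint
client = InferenceClient(
    base_url="https://api.hanzo.ai/v1",
    api_key="your-hanzo-api-key",
)

output = client.chat.completions.create(
    model="meta-llama/llama-4-scout",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Print the assistant's reply
print(output.choices[0].message.content)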
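
The example above hard-codes the key for brevity. A minimal sketch of reading it from the environment instead, assuming the `HANZO_API_KEY` variable name that the cURL example on this page uses; `load_hanzo_key` is a hypothetical helper, not part of `huggingface_hub`.

```python
import os


def load_hanzo_key(env=None, var="HANZO_API_KEY"):
    """Return the API key stored under `var`, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"Set {var} before creating the client")
    return key
```

The result can then be passed as `api_key=load_hanzo_key()` when constructing `InferenceClient`.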

hf CLI with Hanzo endpoint

python
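# Use hf for model downloads, Hanzo for inference.
# Shell command (run outside Python):
#   hf download meta-llama/Llama-4-Scout-17B-16E-Instruct

# Hosted inference goes through the Hanzo API
from openai import OpenAI

client = OpenAI(
    base_url="https://api.hanzo.ai/v1",
    api_key="your-hanzo-api-key",
)

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)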

JS InferenceClient

typescript
import { InferenceClient } from "@huggingface/inference";

const client = new InferenceClient("your-hanzo-api-key");

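// endpointUrl sends the request straight to Hanzo, so no provider is set
const chatCompletion = await client.chatCompletion({
  model: "meta-llama/llama-4-scout",
  messages: [{ role: "user", content: "Hello!" }],
  endpointUrl: "https://api.hanzo.ai/v1",
});

console.log(chatCompletion.choices[0].message.content);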

cURL inference

bash
curl https://api.hanzo.ai/v1/chat/completions \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"meta-llama/llama-4-maverick","messages":[{"role":"user","content":"Hi"}]}'
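
The cURL call above can be reproduced without any SDK. Below is a minimal sketch using only Python's standard library. `build_chat_request` and `send` are hypothetical helper names, and the URL, headers, and payload simply mirror the cURL command.

```python
import json
import urllib.request

# Endpoint taken from the cURL example
HANZO_CHAT_URL = "https://api.hanzo.ai/v1/chat/completions"


def build_chat_request(api_key, model, messages):
    """Build (but do not send) a POST request matching the cURL call."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        HANZO_CHAT_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Building the request separately from sending it lets you inspect the headers and body before any network traffic occurs. Calling `send(build_chat_request(key, "meta-llama/llama-4-maverick", [{"role": "user", "content": "Hi"}]))` performs the actual call.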

Ready to get started?

Create a free account and get your API key. 100K API calls/month free forever.