LiquidAI: LFM2-8B-A1B
LFM2-8B-A1B is an efficient on-device Mixture-of-Experts (MoE) model from Liquid AI’s LFM2 family, built for fast, high-quality inference on edge hardware. It has 8.3B total parameters but activates only ~1.5B per token, which keeps compute and memory usage low and makes it well suited to phones, tablets, and laptops.
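To make the total-versus-active parameter split concrete, here is a rough sizing sketch. The parameter counts come from the description above. The precisions (`fp16`, `int8`, `int4`) and the helper name `weight_memory_gb` are assumptions chosen for illustration, not published figures for this model, and the estimate covers weights only.

```python
# Rough sizing sketch. Parameter counts are taken from the model description;
# the bytes-per-parameter values are common precisions, ASSUMED for illustration.
TOTAL_PARAMS = 8.3e9   # all experts combined
ACTIVE_PARAMS = 1.5e9  # approximate parameters used per token

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Weight storage only, in GB (10^9 bytes); ignores KV cache and runtime overhead."""
    return params * bytes_per_param / 1e9


# Share of the model that participates in computing each token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

if __name__ == "__main__":
    print(f"active fraction per token: {active_fraction:.1%}")
    for name, size in BYTES_PER_PARAM.items():
        print(f"{name}: ~{weight_memory_gb(TOTAL_PARAMS, size):.1f} GB of weights")
```

Under these assumptions, roughly 18% of the parameters take part in each token. Storage is still driven by the total count, since an MoE runtime typically needs every expert's weights available.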
Specifications
| Property | Value |
| --- | --- |
| Context Window | 33K tokens |
| Modalities | text |
| Status | available |
| Category | third-party |
| Model ID | liquid/lfm2-8b-a1b |
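The context window bounds how much conversation history a single request can carry. Below is a minimal sketch of trimming a chat history to fit that budget before sending it. It rests on several assumptions: "33K" is read as 33,000 tokens, tokens are estimated at about 4 characters each (a crude heuristic for English text), and `estimate_tokens` and `trim_history` are hypothetical helpers, not part of any SDK. Use the model's real tokenizer when accurate counts matter.

```python
# Sketch: keep a chat history inside a token budget before sending it.
# ASSUMPTIONS: "33K" is read as 33,000 tokens, and tokens are estimated at
# ~4 characters each. Both are illustrative, not published values.
CONTEXT_WINDOW = 33_000
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Crude token estimate from character count, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def trim_history(messages: list[dict], reserve_for_reply: int = 1_000) -> list[dict]:
    """Drop the oldest messages until the estimated total fits the budget."""
    budget = CONTEXT_WINDOW - reserve_for_reply
    kept: list[dict] = []
    used = 0
    # Walk from newest to oldest so the most recent turns survive.
    for message in reversed(messages):
        cost = estimate_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    # Restore chronological order for the API call.
    return list(reversed(kept))
```

The trimmed list can be passed as `messages` in a chat completion request. `reserve_for_reply` leaves headroom for the generated answer, on the assumption that prompt and completion share the same window.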
Quick Start
// JavaScript / TypeScript (openai SDK)
import OpenAI from 'openai'

const client = new OpenAI({
  apiKey: process.env.HANZO_API_KEY,
  baseURL: 'https://api.hanzo.ai/v1'
})

const response = await client.chat.completions.create({
  model: 'liquid/lfm2-8b-a1b',
  messages: [{ role: 'user', content: 'Hello!' }]
})

console.log(response.choices[0].message.content)

# Python (openai SDK)
import os  # needed for os.environ below

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["HANZO_API_KEY"],
    base_url="https://api.hanzo.ai/v1",
)

response = client.chat.completions.create(
    model="liquid/lfm2-8b-a1b",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)

# cURL
curl https://api.hanzo.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -d '{
    "model": "liquid/lfm2-8b-a1b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

// Go (github.com/sashabaranov/go-openai)
package main

import (
	"context"
	"fmt"
	"os"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	cfg := openai.DefaultConfig(os.Getenv("HANZO_API_KEY"))
	cfg.BaseURL = "https://api.hanzo.ai/v1"
	client := openai.NewClientWithConfig(cfg)

	resp, err := client.CreateChatCompletion(context.Background(),
		openai.ChatCompletionRequest{
			Model: "liquid/lfm2-8b-a1b",
			Messages: []openai.ChatCompletionMessage{
				{Role: openai.ChatMessageRoleUser, Content: "Hello!"},
			},
		},
	)
	if err != nil {
		// Report the failure instead of indexing into an empty response.
		fmt.Fprintln(os.Stderr, "chat completion failed:", err)
		os.Exit(1)
	}

	fmt.Println(resp.Choices[0].Message.Content)
}

More from Liquid AI
LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per token, it delivers high-quality generation while maintaining low inference costs. The model fits within 32 GB of RAM, making it practical to run on consumer laptops and desktops without sacrificing capability.
LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—while still running comfortably on edge devices. It supports long context (up to 32K tokens) and is designed to provide higher-quality “thinking” responses in a small 1.2B model.
LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-device AI. It delivers strong chat quality in a 1.2B parameter footprint, with efficient edge inference and broad runtime support.
LFM2 is a new generation of hybrid models developed by Liquid AI, specifically designed for edge AI and on-device deployment. It sets a new standard in terms of quality, speed, and memory efficiency.
Use LiquidAI: LFM2-8B-A1B via Hanzo AI
One API key. 390+ models. OpenAI-compatible. Start free.