Qwen

Qwen: Qwen2.5 Coder 7B Instruct

Qwen2.5-Coder-7B-Instruct is a 7B parameter instruction-tuned language model optimized for code-related tasks such as code generation, reasoning, and bug fixing. Based on the Qwen2.5 architecture, it incorporates enhancements like RoPE, SwiGLU, RMSNorm, and GQA attention with support for up to 128K tokens using YaRN-based extrapolation. It is trained on a large corpus of source code, synthetic data, and text-code grounding, providing robust performance across programming languages and agentic coding workflows. This model is part of the Qwen2.5-Coder family and offers strong compatibility with tools like vLLM for efficient deployment. Released under the Apache 2.0 license.

textcode

Specifications

Context Window33K
Modalitiestext, code
Statusavailable
Categorythird-party
Model IDqwen/qwen2.5-coder-7b-instruct

Quick Start

TypeScript
import OpenAI from 'openai'

const client = new OpenAI({
  apiKey: process.env.HANZO_API_KEY,
  baseURL: 'https://api.hanzo.ai/v1'
})

const response = await client.chat.completions.create({
  model: 'qwen/qwen2.5-coder-7b-instruct',
  messages: [{ role: 'user', content: 'Hello!' }]
})

console.log(response.choices[0].message.content)
Python
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["HANZO_API_KEY"],
    base_url="https://api.hanzo.ai/v1"
)

response = client.chat.completions.create(
    model="qwen/qwen2.5-coder-7b-instruct",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
cURL
curl https://api.hanzo.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -d '{
    "model": "qwen/qwen2.5-coder-7b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Go
package main

import (
    "context"
    "fmt"
    "os"

    "github.com/sashabaranov/go-openai"
)

func main() {
    cfg := openai.DefaultConfig(os.Getenv("HANZO_API_KEY"))
    cfg.BaseURL = "https://api.hanzo.ai/v1"
    client := openai.NewClientWithConfig(cfg)

    resp, _ := client.CreateChatCompletion(context.Background(),
        openai.ChatCompletionRequest{
            Model: "qwen/qwen2.5-coder-7b-instruct",
            Messages: []openai.ChatCompletionMessage{
                {Role: openai.ChatMessageRoleUser, Content: "Hello!"},
            },
        },
    )
    fmt.Println(resp.Choices[0].Message.Content)
}

More from Qwen

Qwen: Qwen3.5-9B
256K

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified vision-language design with early fusion of multimodal tokens, allowing the model to process and reason across text and images within the same context.

Qwen: Qwen3.5-35B-A3B
262K

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency. Its overall performance is comparable to that of the Qwen3.5-27B.

Qwen: Qwen3.5-27B
262K

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance. Its overall capabilities are comparable to those of the Qwen3.5-122B-A10B.

Qwen: Qwen3.5-122B-A10B
262K

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. In terms of overall performance, this model is second only to Qwen3.5-397B-A17B. Its text capabilities significantly outperform those of Qwen3-235B-2507, and its visual capabilities surpass those of Qwen3-VL-235B.

Qwen: Qwen3.5-Flash
1M

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the 3 series, these models deliver a leap forward in performance for both pure text and multimodal tasks, offering fast response times while balancing inference speed and overall performance.

Qwen: Qwen3.5 Plus 2026-02-15
1M

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency. In a variety of task evaluations, the 3.5 series consistently demonstrates performance on par with state-of-the-art leading models. Compared to the 3 series, these models show a leap forward in both pure-text and multimodal capabilities.

Use Qwen: Qwen2.5 Coder 7B Instruct via Hanzo AI

One API key. 390+ models. OpenAI-compatible. Start free.