NVIDIA: Llama 3.1 Nemotron 70B Instruct

NVIDIA's Llama 3.1 Nemotron 70B is a language model designed to generate precise and useful responses. Built on the [Llama 3.1 70B](/models/meta-llama/llama-3.1-70b-instruct) architecture and trained with Reinforcement Learning from Human Feedback (RLHF), it excels...

Specifications

Context Window	131K
Modalities	text
Status	available
Category	third-party
Model ID	nvidia/llama-3.1-nemotron-70b-instruct

Quick Start

TypeScript

import OpenAI from 'openai'

const client = new OpenAI({
  apiKey: process.env.HANZO_API_KEY,
  baseURL: 'https://api.hanzo.ai/v1'
})

const response = await client.chat.completions.create({
  model: 'nvidia/llama-3.1-nemotron-70b-instruct',
  messages: [{ role: 'user', content: 'Hello!' }]
})

console.log(response.choices[0].message.content)

Python

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["HANZO_API_KEY"],
    base_url="https://api.hanzo.ai/v1"
)

response = client.chat.completions.create(
    model="nvidia/llama-3.1-nemotron-70b-instruct",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
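
If installing the SDK is not an option, the same call can be sketched with only the Python standard library. This is a minimal sketch, not an official client: it assumes the endpoint accepts the JSON body and Bearer header used in the other Quick Start examples and returns the usual OpenAI-style `choices` array. The helper names `build_request` and `send_request` are hypothetical.

```python
import json
import urllib.request

# Endpoint and model ID taken from the Quick Start examples on this page.
API_URL = "https://api.hanzo.ai/v1/chat/completions"
MODEL_ID = "nvidia/llama-3.1-nemotron-70b-instruct"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request.

    Hypothetical helper: mirrors the headers and body of the cURL example.
    """
    body = json.dumps({
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_request(req: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the first choice's text.

    Assumes an OpenAI-style response shape (choices[0].message.content).
    """
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

To use it, call `send_request(build_request("Hello!", os.environ["HANZO_API_KEY"]))` after importing `os`. Keeping request construction separate from sending makes the payload easy to inspect without a network call.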

cURL

curl https://api.hanzo.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -d '{
    "model": "nvidia/llama-3.1-nemotron-70b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Go

package main

import (
    "context"
    "fmt"
    "os"

    "github.com/sashabaranov/go-openai"
)

func main() {
    cfg := openai.DefaultConfig(os.Getenv("HANZO_API_KEY"))
    cfg.BaseURL = "https://api.hanzo.ai/v1"
    client := openai.NewClientWithConfig(cfg)

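    resp, err := client.CreateChatCompletion(context.Background(),
        openai.ChatCompletionRequest{
            Model: "nvidia/llama-3.1-nemotron-70b-instruct",
            Messages: []openai.ChatCompletionMessage{
                {Role: openai.ChatMessageRoleUser, Content: "Hello!"},
            },
        },
    )
    // Check the error before indexing into Choices to avoid a panic.
    if err != nil {
        fmt.Fprintln(os.Stderr, "chat completion failed:", err)
        os.Exit(1)
    }
    fmt.Println(resp.Choices[0].Message.Content)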
}

More from NVIDIA

NVIDIA: Nemotron 3 Super (free)

262K

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...

NVIDIA: Nemotron 3 Super

262K

NVIDIA: Nemotron 3 Nano 30B A3B (free)

256K

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers to build specialized agentic AI systems. The model is fully...

NVIDIA: Nemotron 3 Nano 30B A3B

262K

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers to build specialized agentic AI systems. The model is fully...

NVIDIA: Nemotron Nano 12B 2 VL (free)

128K

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a hybrid Transformer-Mamba architecture, combining transformer-level accuracy with Mamba’s...

NVIDIA: Nemotron Nano 12B 2 VL

131K


Use NVIDIA: Llama 3.1 Nemotron 70B Instruct via Hanzo AI

One API key. 390+ models. OpenAI-compatible. Start free.
