Context Window
128,000
Input tokens
Full-context input ≈ $0.00
Model Cost Profile
Developer: nvidia · Tokenizer: Other · Quantization: bf16
Canonical ID: nvidia/nemotron-nano-9b-v2
Pricing updated Apr 22, 2026
Live Pricing
Input: $0.0000
Output: $0.0000
Last synced Apr 22, 2026 · MMLU score via public benchmark data
NVIDIA Nemotron Nano 9B V2 has a context window of 128,000 tokens, which suits long-document analysis and long-form content generation. This listing is a free API model: input and output tokens are both priced at $0.0000 per 1M, so high token volumes add no per-token fees. Typical uses include chatbots, document summarization tools, and other applications that process large volumes of text.
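As a rough illustration of planning around the 128,000-token window, the sketch below checks whether a prompt plus a reserved completion budget fits. The function name `fits_context` and the token counts are hypothetical. It also assumes that prompt and completion share one window, which is common but not confirmed by this listing. The listing specifies neither a maximum output length nor a tokenizer, so real token counts would have to come from whatever tokenizer your provider exposes.

```python
CONTEXT_WINDOW = 128_000  # tokens, taken from the listing


def fits_context(prompt_tokens: int, reserved_output_tokens: int,
                 context_window: int = CONTEXT_WINDOW) -> bool:
    """Return True if the prompt plus the reserved completion budget fits.

    Assumption: prompt and completion tokens share a single window.
    """
    if prompt_tokens < 0 or reserved_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + reserved_output_tokens <= context_window


# Hypothetical workloads, for illustration only.
print(fits_context(120_000, 4_000))  # 124,000 tokens requested
print(fits_context(126_000, 4_000))  # 130,000 tokens requested
```

If the check fails, the usual options are to trim or chunk the prompt, or to reserve a smaller completion budget.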
| Metric | Value | Notes |
|---|---|---|
| Max Output | Not specified | |
| Input Price / 1M | $0.0000 | Prompt tokens |
| Output Price / 1M | $0.0000 | Completion tokens |
| Top Benchmark | 74.2 | MMLU score (highest of MMLU, GPQA, MATH, HumanEval) |
Evaluation scores for NVIDIA: Nemotron Nano 9B V2 (free). The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.
Price History
Current Input / 1M: $0.000000
Current Output / 1M: $0.000000
Performance History
Current TPS: 0.00
Current Latency: 0 ms
Uptime: 97.2%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.0000 |
| Output (Completion) | $0.0000 |
Estimate monthly spend for NVIDIA: Nemotron Nano 9B V2 (free) based on your workload.
Estimated Monthly Cost
$0.00
25M input + 12M output tokens
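The arithmetic behind this estimate, and behind the full-context input figure, can be sketched as follows. This is a minimal illustration, not the page's actual calculator: the function name `estimate_cost` is hypothetical, and it assumes cost scales linearly with token count at the listed per-1M prices, with no minimums, tiers, or rate-limit effects.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: Decimal,
                  output_price_per_m: Decimal) -> Decimal:
    """Cost in USD for a token volume, given prices per 1M tokens.

    Assumption: purely linear pricing.
    """
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * input_price_per_m
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * output_price_per_m
    return input_cost + output_cost


# Prices from the listing (USD per 1M tokens).
INPUT_PRICE = Decimal("0.0000")
OUTPUT_PRICE = Decimal("0.0000")

# Workload from the estimate: 25M input + 12M output tokens per month.
monthly = estimate_cost(25_000_000, 12_000_000, INPUT_PRICE, OUTPUT_PRICE)

# One request that fills the whole 128,000-token context window.
full_context = estimate_cost(128_000, 0, INPUT_PRICE, OUTPUT_PRICE)

print(f"Estimated monthly cost: ${monthly:.2f}")    # $0.00
print(f"Full-context input:     ${full_context:.2f}")  # $0.00
```

Using `Decimal` rather than floats avoids rounding drift. That makes no difference at a price of zero, but it would if the same function were reused with a paid model's prices.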