Model Cost Profile
Developer: nvidia · Tokenizer: Other · Quantization: unknown
Canonical ID: nvidia/nemotron-nano-12b-v2-vl
Pricing updated Apr 23, 2026
Live Pricing
Input: $0.0000
Output: $0.0000
Last synced Apr 23, 2026 · MMLU score via public benchmark data
The NVIDIA Nemotron Nano 12B 2 VL model has a 128,000-token context window, which suits long-document analysis and long-form content generation. Input and output tokens are both priced at $0.00 on this free API tier, so token cost is not a constraint for workloads such as natural language processing, chatbots, and data extraction. The large window also lets a single request carry larger inputs and keep context across extended interactions.
| Metric | Value | Notes |
|---|---|---|
| Context Window | 128,000 | Input tokens; full-context input ≈ $0.00 |
| Max Output | 128,000 | Completion tokens |
| Input Price / 1M | $0.0000 | Prompt tokens |
| Output Price / 1M | $0.0000 | Completion tokens |
| Top Benchmark | 75.9 | MMLU score, the highest of MMLU, GPQA, MATH, HumanEval |
Evaluation scores for NVIDIA: Nemotron Nano 12B 2 VL (free). The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.
Price History
Current Input / 1M: $0.000000
Current Output / 1M: $0.000000
Performance History
Current TPS: 0.00
Current Latency: 0 ms
Uptime: 98.3%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.0000 |
| Output (Completion) | $0.0000 |
Estimate monthly spend for NVIDIA: Nemotron Nano 12B 2 VL (free) based on your workload.
Estimated Monthly Cost: $0.00 for 25M input + 12M output tokens
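The estimate above is per-1M-token arithmetic: tokens divided by one million, multiplied by the listed price, summed over input and output. A minimal Python sketch of that calculation follows, assuming the listed prices and the 25M / 12M workload shown on this page. The function names `token_cost` and `monthly_cost` are illustrative and not part of any provider API.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

# Listed prices per 1M tokens for this model (free tier).
INPUT_PRICE = Decimal("0.0000")
OUTPUT_PRICE = Decimal("0.0000")


def token_cost(tokens: int, price_per_million: Decimal) -> Decimal:
    """Dollar cost of `tokens` tokens at a per-1M-token price (hypothetical helper)."""
    return Decimal(tokens) * price_per_million / TOKENS_PER_MILLION


def monthly_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: Decimal = INPUT_PRICE,
    output_price: Decimal = OUTPUT_PRICE,
) -> Decimal:
    """Sum of input and output token costs for one month (hypothetical helper)."""
    return token_cost(input_tokens, input_price) + token_cost(output_tokens, output_price)


# Workload from the estimate: 25M input + 12M output tokens.
estimate = monthly_cost(25_000_000, 12_000_000)

# Cost of filling the whole 128,000-token context window with input.
full_context = token_cost(128_000, INPUT_PRICE)

print(f"Estimated monthly cost: ${estimate:.2f}")
print(f"Full-context input: ${full_context:.2f}")
```

With both prices at $0.0000, every workload evaluates to zero. Passing non-zero prices to `monthly_cost` makes the same sketch usable for comparing against paid models.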