Context Window
131,072
Tokens
Model Cost Profile
Developer: nvidia
Pricing updated Mar 11, 2026
Live Pricing
Input: $1.20
Output: $1.20
Pricing via OpenRouter API · Last synced Mar 11, 2026 · MMLU score via public benchmark data
The NVIDIA Llama 3.1 Nemotron 70B Instruct model has a context window of 131,072 tokens, which suits long-input work such as legal document analysis and long-form content generation. Input and output are both priced at $1.20 per 1 million tokens, so spend scales linearly with total token volume. That makes budgeting straightforward for high-volume tasks like customer support automation and data summarization. The long window also helps when a workload must process large documents or keep context across extended interactions.
Input Price / 1M
$1.20
Prompt tokens
Output Price / 1M
$1.20
Completion tokens
Intelligence (MMLU)
71.3
Massive Multitask Language Understanding
Standardized evaluation scores for NVIDIA: Llama 3.1 Nemotron 70B Instruct.
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $1.20 |
| Output (Completion) | $1.20 |
Estimate monthly spend for NVIDIA: Llama 3.1 Nemotron 70B Instruct based on your workload.
Estimated Monthly Cost
$44
25M input + 12M output tokens
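The arithmetic behind this estimate can be reproduced with a minimal sketch like the one below. It assumes flat per-token pricing at the rates listed on this page, with no volume discounts, caching, or provider fees; the function name `estimate_monthly_cost` is hypothetical and only used for illustration.

```python
# Prices in USD per 1M tokens, as listed on this page.
INPUT_PRICE_PER_M = 1.20
OUTPUT_PRICE_PER_M = 1.20


def estimate_monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated monthly spend in USD for the given token volumes.

    Assumes simple linear pricing: tokens / 1M * price per 1M.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Workload from the estimate above: 25M input + 12M output tokens.
cost = estimate_monthly_cost(25_000_000, 12_000_000)
print(f"${cost:.2f}")  # 25 * 1.20 + 12 * 1.20 = 44.40
```

Under these assumptions the result is $44.40, which the page displays rounded to $44.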
Common pricing and benchmark questions for NVIDIA: Llama 3.1 Nemotron 70B Instruct.
NVIDIA: Llama 3.1 Nemotron 70B Instruct input pricing is $1.20 per 1M tokens based on the latest synced provider data.
NVIDIA: Llama 3.1 Nemotron 70B Instruct output pricing is $1.20 per 1M tokens based on the latest synced provider data.
NVIDIA: Llama 3.1 Nemotron 70B Instruct supports a context window of 131,072 tokens.
Use the comparison links on this page to open direct model-vs-model pricing and benchmark pages, then evaluate monthly spend projections for your workload.
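When planning requests against the 131,072-token context window, it can help to check how much room a prompt leaves for the completion. The sketch below assumes that prompt and completion tokens share the same window, which is the usual convention but is not stated on this page; the helper name `max_completion_tokens` is hypothetical, and the prompt token count is expected to come from whatever tokenizer you use.

```python
# Context window in tokens, as listed on this page.
CONTEXT_WINDOW = 131_072


def max_completion_tokens(prompt_tokens: int,
                          context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens remain for the completion.

    Assumption: prompt and completion tokens count against one shared window.
    Returns 0 when the prompt alone fills or exceeds the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


# A 100,000-token prompt leaves 31,072 tokens for the completion.
print(max_completion_tokens(100_000))
```

A result of 0 signals that the prompt needs to be shortened or split before sending.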