Model Cost Profile

NVIDIA: Llama 3.1 Nemotron 70B Instruct

Developer: nvidia

Pricing updated Mar 11, 2026

Input rank: #252
Output rank: #188

Live Pricing

Input: $1.20 per 1M tokens

Output: $1.20 per 1M tokens

Pricing via OpenRouter API · Last synced Mar 11, 2026 · MMLU score via public benchmark data

The NVIDIA Llama 3.1 Nemotron 70B Instruct model has a context window of 131,072 tokens, which suits applications that need to read long inputs, such as legal document analysis or long-form content generation. Input and output are both priced at $1.20 per 1 million tokens, so the cost of a request depends only on its total token count, which makes budgeting straightforward for high-volume tasks like customer support automation and data summarization. The long context is most useful for organizations that process large documents or need to maintain context over extended interactions.
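The per-token pricing above can be turned into a per-request estimate with a few lines of arithmetic. The following is a minimal sketch in Python: the prices are the ones listed on this page, while the function name `request_cost` and the example token counts are illustrative assumptions, not part of any provider API.

```python
# Prices as listed on this page, in USD per 1 million tokens.
INPUT_PRICE_PER_M = 1.20
OUTPUT_PRICE_PER_M = 1.20


def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated USD cost of a single request (hypothetical helper)."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (prompt_tokens * INPUT_PRICE_PER_M
             + completion_tokens * OUTPUT_PRICE_PER_M)
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token document summarized into 2,000 tokens.
    print(f"${request_cost(100_000, 2_000):.4f}")
```

Because the input and output prices are equal, the split between prompt and completion tokens does not change the estimate; only the total matters.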

🔧 Tool Calling

📋 Structured Output

Context Window

131,072

Tokens

Input Price / 1M

$1.20

Prompt tokens

Output Price / 1M

$1.20

Completion tokens

Intelligence (MMLU)

71.3

Massive Multitask Language Understanding

Benchmark Scores

Standardized evaluation scores for NVIDIA: Llama 3.1 Nemotron 70B Instruct.

Benchmark   Score   Rank         Source
GPQA        49.8    #87 of 118   artificial_analysis
MMLU        71.3    #80 of 121   artificial_analysis

Price History

NVIDIA: Llama 3.1 Nemotron 70B Instruct Pricing Trend

Input / 1M tokens: 0.0% change
Output / 1M tokens: 0.0% change
Mar 7 to Mar 11
[Chart: price per 1M tokens, flat at $1.20 from Mar 7 through Mar 11]

Current Input / 1M

$1.20

Current Output / 1M

$1.20

Cheaper Alternatives to Compare

Quick links for cost-down decisions before production rollout.

FAQ

Common pricing and benchmark questions for NVIDIA: Llama 3.1 Nemotron 70B Instruct.

How much does NVIDIA: Llama 3.1 Nemotron 70B Instruct cost per 1M input tokens?

NVIDIA: Llama 3.1 Nemotron 70B Instruct input pricing is $1.20 per 1M tokens based on the latest synced provider data.

How much does NVIDIA: Llama 3.1 Nemotron 70B Instruct cost per 1M output tokens?

NVIDIA: Llama 3.1 Nemotron 70B Instruct output pricing is $1.20 per 1M tokens based on the latest synced provider data.

What context window does NVIDIA: Llama 3.1 Nemotron 70B Instruct support?

NVIDIA: Llama 3.1 Nemotron 70B Instruct supports a context window of 131,072 tokens.
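A quick pre-flight check can tell you whether a request fits in that window. The sketch below assumes the 131,072-token window is shared between prompt and completion tokens, which is the common convention but is not stated on this page; the helper names are hypothetical.

```python
# Context window as listed on this page.
CONTEXT_WINDOW = 131_072


def fits_context(prompt_tokens: int, max_completion_tokens: int) -> bool:
    """True if prompt plus requested completion fit in the window.

    Assumes prompt and completion share one window (not confirmed here).
    """
    return prompt_tokens + max_completion_tokens <= CONTEXT_WINDOW


def max_completion_budget(prompt_tokens: int) -> int:
    """Tokens left for the completion after the prompt, never negative."""
    return max(0, CONTEXT_WINDOW - prompt_tokens)


if __name__ == "__main__":
    print(fits_context(130_000, 1_072))
    print(max_completion_budget(100_000))
```

Token counts depend on the tokenizer, so count tokens with the provider's own tooling rather than estimating from characters.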

How can I compare NVIDIA: Llama 3.1 Nemotron 70B Instruct with cheaper alternatives?

Use the comparison links on this page to open direct model-vs-model pricing and benchmark pages, then evaluate monthly spend projections for your workload.
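A monthly spend projection for a workload can be sketched as follows. The Nemotron prices come from this page; the `Pricing` class, the `monthly_spend` function, the workload numbers, and especially the alternative's prices are placeholders for illustration, not real quotes for any model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per 1 million tokens (hypothetical container type)."""
    input_per_m: float
    output_per_m: float


# Prices as listed on this page.
NEMOTRON_70B = Pricing(input_per_m=1.20, output_per_m=1.20)
# Placeholder only: substitute the real prices of the model you compare.
ALTERNATIVE = Pricing(input_per_m=0.60, output_per_m=0.60)


def monthly_spend(pricing: Pricing, requests_per_day: int,
                  prompt_tokens: int, completion_tokens: int,
                  days: int = 30) -> float:
    """Projected USD spend for a steady workload over `days` days."""
    per_request = (prompt_tokens * pricing.input_per_m
                   + completion_tokens * pricing.output_per_m) / 1_000_000
    return per_request * requests_per_day * days


if __name__ == "__main__":
    # Assumed workload: 10,000 requests/day, 2,000 prompt + 500 completion tokens.
    for name, pricing in [("nemotron", NEMOTRON_70B), ("alternative", ALTERNATIVE)]:
        print(name, f"${monthly_spend(pricing, 10_000, 2_000, 500):,.2f}")
```

A projection like this covers price only; weigh it against the benchmark scores before switching models.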