Model Cost Profile
Developer: meta-llama · Tokenizer: Llama3 · Instruct: llama3 · Quantization: fp8
Canonical ID: meta-llama/llama-3.3-70b-instruct
Pricing updated Apr 23, 2026
Live Pricing
Input: $0.1000 / 1M tokens
Output: $0.3200 / 1M tokens
Last synced Apr 23, 2026 · MMLU score via public benchmark data
Meta's Llama 3.3 70B Instruct offers a substantial context window of 131,072 tokens, making it well suited to applications that require extensive text comprehension and generation, such as legal document analysis and long-form content creation. At $0.10 per 1 million input tokens and $0.32 per 1 million output tokens, teams can run complex, data-intensive workloads while keeping budgets predictable, scaling usage up or down with project demand.
| Metric | Value | Notes |
|---|---|---|
| Context Window | 131,072 | Input tokens; full-context input ≈ $0.01 |
| Max Output | 16,384 | Completion tokens |
| Input Price / 1M | $0.1000 | Prompt tokens |
| Output Price / 1M | $0.3200 | Completion tokens |
| Top Benchmark | 71.3 | MMLU score (highest of MMLU, GPQA, MATH, HumanEval) |
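As a quick sanity check on the "full-context input ≈ $0.01" figure, the arithmetic is just the context window times the per-token input price. A minimal sketch (constant names are illustrative, not from any API):

```python
# Verify the "full-context input ≈ $0.01" figure from the stats above.
CONTEXT_WINDOW = 131_072         # input tokens
INPUT_PRICE_PER_1M = 0.10        # USD per 1M prompt tokens

full_context_cost = CONTEXT_WINDOW / 1_000_000 * INPUT_PRICE_PER_1M
print(f"${full_context_cost:.4f}")  # → $0.0131, i.e. roughly one cent
```

A full 131,072-token prompt costs about 1.3 cents in input tokens, which the page rounds to ≈ $0.01.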
Evaluation scores for Meta: Llama 3.3 70B Instruct. The "Top Benchmark" shown above is the highest score across MMLU, GPQA, MATH, and HumanEval.
Price History
Current Input / 1M: $0.1000
Current Output / 1M: $0.3200
Performance History
Current TPS: 0.00
Current Latency: 0 ms
Uptime: 99.2%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.1000 |
| Output (Completion) | $0.3200 |
Estimate monthly spend for Meta: Llama 3.3 70B Instruct based on your workload.
Estimated Monthly Cost
$6.34
25M input + 12M output tokens
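The estimate above is linear in token volume, so it can be reproduced with a few lines. A minimal sketch, assuming the listed per-1M prices and the sample workload (the `monthly_cost` helper is hypothetical, not part of any SDK):

```python
# Reproduce the monthly-spend estimate: 25M input + 12M output tokens
# at the listed per-1M prices for Llama 3.3 70B Instruct.
INPUT_PRICE = 0.10    # USD per 1M prompt tokens
OUTPUT_PRICE = 0.32   # USD per 1M completion tokens

def monthly_cost(input_millions: float, output_millions: float) -> float:
    """Estimated monthly spend in USD for a given token volume."""
    return input_millions * INPUT_PRICE + output_millions * OUTPUT_PRICE

print(f"${monthly_cost(25, 12):.2f}")  # → $6.34
```

Swap in your own monthly token volumes to project spend before committing to a production rollout.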
Quick links for cost-reduction decisions before production rollout.