Model Cost Profile

Meta: Llama 3.2 3B Instruct

Developer: meta-llama · Tokenizer: Llama3 · Instruct: llama3 · Quantization: unknown

Canonical ID: meta-llama/llama-3.2-3b-instruct

Pricing updated Apr 24, 2026

Input rank: #52 · Output rank: #93

Live Pricing

Input: $0.0510

Output: $0.3400

Last synced Apr 24, 2026 · MMLU score via public benchmark data

Meta: Llama 3.2 3B Instruct targets applications that need long-context handling, such as multi-turn dialogue systems and long-form content generation. The model supports a context window of up to 131,072 tokens, while this listing reports 80,000 input tokens. At $0.0510 per million input tokens and $0.3400 per million output tokens, it is a low-cost option for teams scaling workloads such as customer support and content creation, where long conversations and large prompts are common.

Context Window

80,000

Input tokens

Full-context input ≈ $0.0041 (80,000 tokens at $0.0510 per 1M)

Max Output

—

Not specified

Input Price / 1M

$0.0510

Prompt tokens

Output Price / 1M

$0.3400

Completion tokens

Top Benchmark

34.7

MMLU score — highest of MMLU, GPQA, MATH, HumanEval

Quality & Benchmarks

Evaluation scores for Meta: Llama 3.2 3B Instruct. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.

Benchmark	Score	Rank	Source
GPQA	25.5	#122 of 125	artificial_analysis
MMLU	34.7	#125 of 127	artificial_analysis
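
As a small illustration of how the "Top Benchmark" figure follows from the table above, the sketch below takes the highest of the scores that are listed. The `scores` dictionary and the `top_benchmark` helper are hypothetical names used only for this example; MATH and HumanEval are left out because this page shows no scores for them.

```python
# Scores copied from the benchmark table on this page.
# MATH and HumanEval have no entry here, so they are not included.
scores = {"GPQA": 25.5, "MMLU": 34.7}


def top_benchmark(scores: dict[str, float]) -> tuple[str, float]:
    """Return the (name, score) pair with the highest score."""
    name = max(scores, key=scores.get)
    return name, scores[name]


print(top_benchmark(scores))
```

With only GPQA and MMLU available, the maximum is the MMLU score of 34.7, which matches the "Top Benchmark" card.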

Price History

Meta: Llama 3.2 3B Instruct Pricing Trend

Input / 1M tokens: 0.0% · Output / 1M tokens: 0.0%

Current Input / 1M

$0.0510

Current Output / 1M

$0.3400

Performance History

Meta: Llama 3.2 3B Instruct Speed Trend

Tokens/sec (higher is better) · Latency (lower is better)

Current TPS

0.00

Current Latency

0ms

Uptime

100.0%

Side-by-Side Pricing Table

Usage Type	Price / 1M Tokens
Input (Prompt)	$0.0510
Output (Completion)	$0.3400

Compare with: Meta: Llama 3 8B Instruct · Mistral: Mistral Small 3 · NVIDIA: Nemotron 3 Nano 30B A3B

Cost Calculator

Estimate monthly spend for Meta: Llama 3.2 3B Instruct based on your workload.

Input tokens / month

Range: 0 to 1.0B

Output tokens / month

Range: 0 to 500M

Estimated Monthly Cost

$5.36

25M input + 12M output tokens
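
The estimate above can be reproduced from the listed per-million prices. The following is a minimal sketch, not the calculator's actual implementation: the function names are hypothetical, and half-up rounding to cents is an assumption chosen because it matches the displayed figure (the page does not state its rounding rule). `Decimal` is used instead of floats so that the half-cent case rounds predictably.

```python
from decimal import Decimal, ROUND_HALF_UP

# Prices per 1M tokens, copied from the pricing table on this page.
INPUT_PRICE_PER_M = Decimal("0.0510")
OUTPUT_PRICE_PER_M = Decimal("0.3400")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Unrounded USD cost for the given token counts."""
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M / TOKENS_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M / TOKENS_PER_M
    return input_cost + output_cost


def to_cents(amount: Decimal) -> Decimal:
    # Assumption: half-up rounding to cents; the page's own rule is not stated.
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Workload shown in the calculator: 25M input + 12M output tokens.
monthly = estimate_cost(25_000_000, 12_000_000)
print(to_cents(monthly))
```

For the workload shown, the input side costs 25 × $0.0510 = $1.275 and the output side costs 12 × $0.3400 = $4.08, for a total of $5.355, which rounds to the displayed $5.36. The same function gives the full-context input cost: `estimate_cost(80_000, 0)` is $0.00408.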

Same Workload on Other Models

Model	Monthly Cost	Difference
Baidu: Qianfan-OCR-Fast (free)	$0.00	−$5.36
Free Models Router	$0.00	−$5.36
Google: Gemma 3 12B (free)	$0.00	−$5.36
Google: Gemma 3 27B (free)	$0.00	−$5.36

Cheaper Alternatives to Compare

Quick links for cost-down decisions before production rollout.

Meta: Llama 3.2 3B Instruct vs Baidu: Qianfan-OCR-Fast (free)
Meta: Llama 3.2 3B Instruct vs Free Models Router
Meta: Llama 3.2 3B Instruct vs Google: Gemma 3 12B (free)
Meta: Llama 3.2 3B Instruct vs Google: Gemma 3 27B (free)
