Model Cost Profile

Meta: Llama 3.1 8B Instruct

Developer: meta-llama · Tokenizer: Llama3 · Instruct: llama3 · Quantization: fp8

Canonical ID: meta-llama/llama-3.1-8b-instruct

Pricing updated Apr 24, 2026

Input rank: #33 · Output rank: #34

Live Pricing

Input: $0.0200 / 1M tokens

Output: $0.0500 / 1M tokens


Last synced Apr 24, 2026 · MMLU score via public benchmark data

Meta: Llama 3.1 8B Instruct is listed here with a context window of 16,384 tokens, which suits conversational agents, content generation, and data analysis tasks whose prompts and replies fit within that limit. Pricing is $0.02 per million input tokens and $0.05 per million output tokens, so cost scales linearly with usage and teams can budget high-volume applications directly from their expected token counts.
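As a rough illustration of how the per-million rates translate into the cost of a single request, here is a minimal sketch. The function name `request_cost` and the constant names are hypothetical and not part of any provider API; the prices and context window are the values listed on this page.

```python
# Values copied from this page; names are illustrative assumptions.
INPUT_PRICE_PER_M = 0.02   # USD per 1M prompt tokens
OUTPUT_PRICE_PER_M = 0.05  # USD per 1M completion tokens
CONTEXT_WINDOW = 16_384    # tokens


def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of one request at the listed rates."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (prompt_tokens * INPUT_PRICE_PER_M
             + completion_tokens * OUTPUT_PRICE_PER_M)
    return total / 1_000_000


# A prompt that fills the whole listed context window, with no completion:
print(f"${request_cost(CONTEXT_WINDOW, 0):.6f}")  # $0.000328
```

At these rates a full-context prompt costs well under a tenth of a cent, which is why a figure rounded to two decimal places shows as $0.00.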

🔧 Tool Calling · 🔌 MCP Compatible · 📋 Structured Output

Context Window

16,384

Input tokens

Full-context input ≈ $0.0003

Max Output

16,384

Completion tokens

Input Price / 1M

$0.0200

Prompt tokens

Output Price / 1M

$0.0500

Completion tokens

Top Benchmark

47.6

MMLU score — highest of MMLU, GPQA, MATH, HumanEval

Quality & Benchmarks

Evaluation scores for Meta: Llama 3.1 8B Instruct. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.

Benchmark	Score	Rank	Source
GPQA	25.9	#120 of 125	artificial_analysis
MMLU	47.6	#119 of 127	artificial_analysis

Price History

Meta: Llama 3.1 8B Instruct Pricing Trend

Input / 1M tokens: 0.0% · Output / 1M tokens: 0.0%

Current Input / 1M

$0.0200

Current Output / 1M

$0.0500

Performance History

Meta: Llama 3.1 8B Instruct Speed Trend

Tokens/sec (higher is better) · Latency (lower is better)

Current TPS

0.00

Current Latency

0ms

Uptime

99.9%

Side-by-Side Pricing Table

Usage Type	Price / 1M Tokens
Input (Prompt)	$0.0200
Output (Completion)	$0.0500


Cost Calculator

Estimate monthly spend for Meta: Llama 3.1 8B Instruct based on your workload.

Input tokens / month

Range: 0 to 1.0B

Output tokens / month

Range: 0 to 500M

Estimated Monthly Cost

$1.10

25M input + 12M output tokens
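The estimate above can be reproduced by hand: 25M input tokens at $0.02 per million is $0.50, and 12M output tokens at $0.05 per million is $0.60, for $1.10 in total. A minimal sketch of that arithmetic follows; `monthly_cost` is a hypothetical helper name, and `Decimal` is used only to keep the currency arithmetic exact.

```python
from decimal import Decimal

# Per-million-token prices as listed on this page (USD).
INPUT_PRICE = Decimal("0.02")
OUTPUT_PRICE = Decimal("0.05")
MILLION = Decimal(1_000_000)


def monthly_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate monthly spend in USD from monthly token volumes."""
    input_cost = Decimal(input_tokens) * INPUT_PRICE / MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE / MILLION
    return input_cost + output_cost


# The workload shown in the calculator: 25M input + 12M output tokens.
cost = monthly_cost(25_000_000, 12_000_000)
print(f"${cost:.2f}")  # $1.10
```

Substituting other monthly volumes gives the corresponding estimate, assuming the listed prices stay flat across the month.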

Same Workload on Other Models

Model	Cost	Difference
Baidu: Qianfan-OCR-Fast (free)	$0.00	−$1.10
Free Models Router	$0.00	−$1.10
Google: Gemma 3 12B (free)	$0.00	−$1.10
Google: Gemma 3 27B (free)	$0.00	−$1.10

