Model Cost Profile

Qwen: Qwen3 8B

Developer: qwen · Tokenizer: Qwen3 · Instruct: qwen3 · Quantization: fp8

Canonical ID: qwen/qwen3-8b-04-28

Pricing updated Apr 23, 2026

Input rank: #49 · Output rank: #105

Live Pricing

Input: $0.0500 / 1M tokens

Output: $0.4000 / 1M tokens


Last synced Apr 23, 2026 · MMLU score via public benchmark data

Qwen3 8B, developed by Qwen, has a context window of 40,960 tokens and a maximum output of 8,192 tokens, which suits long-input workloads such as document review and long-form content generation. Input costs $0.05 per million tokens and output costs $0.40 per million tokens, so output is priced at eight times the input rate and completion length dominates spend on generation-heavy workloads. Billing is per token, so cost grows in proportion to usage, from small projects to large-scale deployments.

🔧 Tool Calling · 🔌 MCP Compatible · 📋 Structured Output · 🧠 Reasoning

Context Window

40,960

Input tokens

Full-context input ≈ $0.002 (40,960 tokens × $0.05 / 1M)

Max Output

8,192

Completion tokens

Input Price / 1M

$0.0500

Prompt tokens

Output Price / 1M

$0.4000

Completion tokens

Top Benchmark

64.3

MMLU score — highest of MMLU, GPQA, MATH, HumanEval

Quality & Benchmarks

Evaluation scores for Qwen: Qwen3 8B. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.

Benchmark	Score	Rank	Source
GPQA	45.2	#107 of 125	artificial_analysis
MMLU	64.3	#112 of 127	artificial_analysis
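
The "Top Benchmark" rule described above (take the highest of the available scores) can be sketched in a few lines. This is a minimal illustration, not the site's actual logic: the `scores` dictionary and the `top_benchmark` helper are hypothetical names, and MATH and HumanEval are treated as missing because this page lists no scores for them.

```python
# Scores from the table above. MATH and HumanEval are not reported on this
# page, so they are marked as missing (None) rather than given a value.
scores = {"MMLU": 64.3, "GPQA": 45.2, "MATH": None, "HumanEval": None}


def top_benchmark(scores):
    """Return (name, score) for the highest available benchmark score."""
    available = {name: s for name, s in scores.items() if s is not None}
    if not available:
        raise ValueError("no benchmark scores available")
    return max(available.items(), key=lambda item: item[1])


name, score = top_benchmark(scores)
print(f"Top Benchmark: {score} ({name})")  # Top Benchmark: 64.3 (MMLU)
```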

Price History

Qwen: Qwen3 8B Pricing Trend

Input / 1M tokens: 0.0% · Output / 1M tokens: 0.0%

Current Input / 1M

$0.0500

Current Output / 1M

$0.4000

Performance History

Qwen: Qwen3 8B Speed Trend

Tokens/sec (higher is better) · Latency (lower is better)

Current TPS

0.00

Current Latency

0ms

Uptime

99.9%

Side-by-Side Pricing Table

Usage Type	Price / 1M Tokens
Input (Prompt)	$0.0500
Output (Completion)	$0.4000
Cache Read	$0.0500
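
To show how the three rates in this table combine, here is a rough sketch of a per-request cost estimate. It rests on assumptions that the page does not confirm: that cached prompt tokens are billed at the Cache Read rate instead of the Input rate, and that the 40,960-token context window limits the prompt alone. The `request_cost` function and its parameter names are hypothetical.

```python
# Prices in USD per 1M tokens, taken from the pricing table above.
INPUT_PRICE_PER_M = 0.05
OUTPUT_PRICE_PER_M = 0.40
CACHE_READ_PRICE_PER_M = 0.05

# Limits listed on this page.
CONTEXT_WINDOW = 40_960
MAX_OUTPUT = 8_192


def request_cost(prompt_tokens, completion_tokens, cached_tokens=0):
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the part of the prompt served from cache
    and is billed at the cache-read rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    if prompt_tokens > CONTEXT_WINDOW:
        raise ValueError("prompt exceeds the context window")
    if not 0 <= completion_tokens <= MAX_OUTPUT:
        raise ValueError("completion_tokens must be between 0 and the max output")

    uncached_tokens = prompt_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_PRICE_PER_M
        + cached_tokens * CACHE_READ_PRICE_PER_M
        + completion_tokens * OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


# A prompt that fills the whole context window, with no completion.
print(f"${request_cost(40_960, 0):.6f}")  # $0.002048
```

Because the Cache Read rate listed here equals the Input rate, this sketch returns the same cost whether or not the prompt is cached.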


Cost Calculator

Estimate monthly spend for Qwen: Qwen3 8B based on your workload.

Input tokens / month (range: 0 to 1.0B)

Output tokens / month (range: 0 to 500M)

Estimated Monthly Cost

$6.05

25M input + 12M output tokens
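
The estimate above can be reproduced with simple arithmetic: 25M input tokens at $0.05 per million is $1.25, and 12M output tokens at $0.40 per million is $4.80, for a total of $6.05. The sketch below does the same calculation for any workload. The `monthly_cost` function is a hypothetical helper, and it assumes flat per-token pricing with no caching discounts or volume tiers.

```python
def monthly_cost(input_tokens, output_tokens,
                 input_price_per_m=0.05, output_price_per_m=0.40):
    """Estimate monthly spend in USD from token volumes.

    Default prices are the per-1M-token rates listed on this page.
    Assumes flat per-token pricing (no tiers or discounts).
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# The example workload shown above: 25M input + 12M output tokens.
print(f"${monthly_cost(25_000_000, 12_000_000):.2f}")  # $6.05
```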

Same Workload on Other Models

Model	Monthly Cost	Difference
Arcee AI: Trinity Large Preview (free)	$0.00	−$6.05
Free Models Router	$0.00	−$6.05
Google: Gemma 3 12B (free)	$0.00	−$6.05
Google: Gemma 3 27B (free)	$0.00	−$6.05

