Model Cost Profile
Developer: qwen · Tokenizer: Qwen3 · Instruct: qwen3 · Quantization: fp8
Canonical ID: qwen/qwen3-8b-04-28
Pricing updated Apr 23, 2026
Live Pricing
Input: $0.0500 / 1M tokens
Output: $0.4000 / 1M tokens
Last synced Apr 23, 2026 · MMLU score via public benchmark data
Qwen3 8B, developed by Qwen, has a context window of 40,960 tokens and a maximum output of 8,192 tokens, which suits tasks that analyze or generate moderately long text, such as legal document review and long-form content creation. Input costs $0.05 per million tokens and output costs $0.40 per million tokens, so a completion token costs eight times as much as a prompt token and output volume tends to drive most of the bill. The listed rates are flat per token, so the same pricing applies to small projects and large-scale deployments.
Context Window
40,960
Input tokens
Full-context input ≈ $0.002
Max Output
8,192
Completion tokens
Input Price / 1M
$0.0500
Prompt tokens
Output Price / 1M
$0.4000
Completion tokens
Top Benchmark
64.3
MMLU score — highest of MMLU, GPQA, MATH, HumanEval
Evaluation scores for Qwen: Qwen3 8B. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.
Price History
Current Input / 1M
$0.0500
Current Output / 1M
$0.4000
Performance History
Current TPS
0.00
Current Latency
0ms
Uptime
99.9%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.0500 |
| Output (Completion) | $0.4000 |
| Cache Read | $0.0500 |
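The per-request arithmetic behind these rates can be sketched as follows. This is a minimal illustration, not a billing implementation: the helper `request_cost` is hypothetical, and it assumes the 40,960-token context window limits the prompt and the 8,192-token maximum limits the completion, as the figures on this page suggest. Cache reads are ignored because they are listed at the same rate as input.

```python
from decimal import Decimal

# Rates in USD per 1M tokens and limits, copied from this page.
INPUT_PRICE_PER_M = Decimal("0.05")
OUTPUT_PRICE_PER_M = Decimal("0.40")
CONTEXT_WINDOW = 40_960  # input tokens
MAX_OUTPUT = 8_192       # completion tokens


def request_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Return the USD cost of one request (hypothetical helper)."""
    # Assumption: the limits apply separately to prompt and completion.
    if not 0 <= prompt_tokens <= CONTEXT_WINDOW:
        raise ValueError("prompt_tokens is outside the listed context window")
    if not 0 <= completion_tokens <= MAX_OUTPUT:
        raise ValueError("completion_tokens is outside the listed max output")
    total = (prompt_tokens * INPUT_PRICE_PER_M
             + completion_tokens * OUTPUT_PRICE_PER_M)
    return total / Decimal(1_000_000)


# Filling the whole context window with input costs a fraction of a cent.
full_context = request_cost(CONTEXT_WINDOW, 0)
print(f"Full-context input: ${full_context:.6f}")

# A full prompt plus a maximum-length completion.
worst_case = request_cost(CONTEXT_WINDOW, MAX_OUTPUT)
print(f"Full prompt + max output: ${worst_case:.6f}")
```

A full-context prompt works out to $0.002048, which is why a two-decimal display rounds it to $0.00.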
Estimate monthly spend for Qwen: Qwen3 8B based on your workload.
Estimated Monthly Cost
$6.05
25M input + 12M output tokens
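The estimate above follows directly from the listed rates: 25 × $0.05 + 12 × $0.40 = $1.25 + $4.80 = $6.05. A short sketch that reproduces it, where `monthly_cost` is a hypothetical helper and the workload is expressed in millions of tokens:

```python
from decimal import Decimal

# Rates in USD per 1M tokens, copied from this page.
INPUT_PRICE_PER_M = Decimal("0.05")
OUTPUT_PRICE_PER_M = Decimal("0.40")


def monthly_cost(input_millions: int, output_millions: int) -> Decimal:
    """Return the USD cost of a monthly workload (hypothetical helper)."""
    input_cost = Decimal(input_millions) * INPUT_PRICE_PER_M
    output_cost = Decimal(output_millions) * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


estimate = monthly_cost(25, 12)
print(f"Estimated monthly cost: ${estimate:.2f}")  # Estimated monthly cost: $6.05
```

In this workload the 12M output tokens account for $4.80 of the $6.05, so trimming completion length reduces spend more than trimming prompts by the same number of tokens.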