Context Window
40,960
Input tokens
Full-context input ≈ $0.0033
Model Cost Profile
Developer: qwen · Tokenizer: Qwen3 · Instruct: qwen3 · Quantization: fp8
Canonical ID: qwen/qwen3-32b-04-28
Pricing updated Apr 24, 2026
Live Pricing
Input: $0.0800
Output: $0.2400
Last synced Apr 24, 2026 · MMLU score via public benchmark data
Qwen3 32B, developed by Qwen, has a context window of 40,960 tokens, which suits tasks such as document summarization and conversational AI. Input is priced at $0.08 per million tokens and output at $0.24 per million tokens, so each output token costs three times as much as an input token. Teams planning high-volume API integrations can estimate spend directly from their expected input and output token counts.
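To make the per-token pricing concrete, here is a minimal sketch that computes the cost of a single request from the listed rates. The function name `request_cost` and the constants are illustrative and not part of any provider SDK. The sketch assumes billing is strictly linear in token count, with no minimum charge.

```python
# Listed prices in USD per 1M tokens (taken from this page).
INPUT_PRICE_PER_M = 0.08
OUTPUT_PRICE_PER_M = 0.24
CONTEXT_WINDOW = 40_960  # tokens


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request (hypothetical helper).

    Assumes linear per-token billing at the listed rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens * INPUT_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


if __name__ == "__main__":
    # Cost of filling the whole context window with prompt tokens.
    full_context = request_cost(CONTEXT_WINDOW, 0)
    print(f"Full-context input: ${full_context:.6f}")
```

Under these assumptions, a full 40,960-token prompt costs well under one cent, so per-request cost only becomes significant at high request volumes.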
Max Output
40,960
Completion tokens
Input Price / 1M
$0.0800
Prompt tokens
Output Price / 1M
$0.2400
Completion tokens
Top Benchmark
72.7
MMLU score — highest of MMLU, GPQA, MATH, HumanEval
Evaluation scores for Qwen: Qwen3 32B. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.
Price History
Current Input / 1M
$0.0800
Current Output / 1M
$0.2400
Performance History
Current TPS
0.00
Current Latency
0ms
Uptime
97.4%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.0800 |
| Output (Completion) | $0.2400 |
| Cache Read | $0.0400 |
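The cache-read rate in the table is half the regular input rate. The sketch below shows how that could translate into a blended input price. It assumes that cached prompt tokens are billed at the cache-read rate instead of the input rate, which this page does not state explicitly. `effective_input_price` and `cache_hit_ratio` are hypothetical names used for illustration.

```python
# Listed prices in USD per 1M tokens (taken from the table above).
INPUT_PRICE_PER_M = 0.08
CACHE_READ_PRICE_PER_M = 0.04


def effective_input_price(cache_hit_ratio: float) -> float:
    """Blended input price per 1M tokens (hypothetical helper).

    ASSUMPTION: a fraction `cache_hit_ratio` of prompt tokens is served
    from cache and billed at the cache-read rate; the remainder is
    billed at the regular input rate.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return (
        cache_hit_ratio * CACHE_READ_PRICE_PER_M
        + (1.0 - cache_hit_ratio) * INPUT_PRICE_PER_M
    )


if __name__ == "__main__":
    for ratio in (0.0, 0.5, 1.0):
        print(f"hit ratio {ratio:.0%}: ${effective_input_price(ratio):.4f} / 1M")
```

If the assumption holds, workloads that reuse a long shared prompt prefix would see the largest savings, since more of their input falls under the lower rate.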
Estimate monthly spend for Qwen: Qwen3 32B based on your workload.
Estimated Monthly Cost
$4.88
25M input + 12M output tokens
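The estimate above can be reproduced from the listed rates. The following sketch breaks the total into its input and output parts. `monthly_cost` is a hypothetical helper, and the calculation assumes no cache reads or volume discounts.

```python
# Listed prices in USD per 1M tokens (taken from this page).
INPUT_PRICE_PER_M = 0.08
OUTPUT_PRICE_PER_M = 0.24


def monthly_cost(input_millions: float, output_millions: float) -> dict:
    """Return a USD cost breakdown for a monthly workload.

    Token volumes are given in millions. Assumes plain linear pricing
    with no cache reads or discounts (hypothetical helper).
    """
    input_cost = input_millions * INPUT_PRICE_PER_M
    output_cost = output_millions * OUTPUT_PRICE_PER_M
    return {
        "input": input_cost,
        "output": output_cost,
        "total": input_cost + output_cost,
    }


if __name__ == "__main__":
    breakdown = monthly_cost(25, 12)
    # 25 x $0.08 = $2.00 for input, 12 x $0.24 = $2.88 for output
    print(f"Input:  ${breakdown['input']:.2f}")
    print(f"Output: ${breakdown['output']:.2f}")
    print(f"Total:  ${breakdown['total']:.2f}")
```

In this workload the output tokens account for more than half of the total despite being under half the input volume, because of the 3x output rate.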