Model Cost Profile

Qwen: Qwen3 32B

Developer: qwen · Tokenizer: Qwen3 · Instruct: qwen3 · Quantization: fp8

Canonical ID: qwen/qwen3-32b-04-28

Pricing updated Apr 24, 2026

Input rank: #73 · Output rank: #72

Live Pricing

Input: $0.0800

Output: $0.2400

Last synced Apr 24, 2026 · MMLU score via public benchmark data

Qwen3 32B, developed by Qwen, has a context window of 40,960 tokens, which suits tasks such as document summarization and conversational AI. Input costs $0.08 per million tokens and output costs $0.24 per million tokens, so output is priced at three times the input rate. Because billing is per token, spend scales directly with usage, and teams can budget for API integrations from their expected input and output volumes.

💡 Enable prompt caching to save 50% on repeated input tokens ($0.0400/M cached vs $0.0800/M standard).
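As a rough illustration of the caching tip above, the sketch below blends the two input rates by the share of input tokens served from cache. The `cache_hit_ratio` parameter and the function name are hypothetical, and the sketch assumes no extra charge for writing to the cache, which this page does not state.

```python
STANDARD_INPUT_PER_M = 0.08  # USD per 1M input tokens (from this page)
CACHED_INPUT_PER_M = 0.04    # USD per 1M cached input tokens (from this page)


def blended_input_cost(input_tokens: int, cache_hit_ratio: float) -> float:
    """Input cost in USD when a fraction of input tokens is read from cache.

    cache_hit_ratio is a hypothetical workload parameter between 0 and 1.
    Cache-write fees, if any, are not modeled (assumption).
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    cached = input_tokens * cache_hit_ratio
    uncached = input_tokens - cached
    total = cached * CACHED_INPUT_PER_M + uncached * STANDARD_INPUT_PER_M
    return total / 1_000_000


# 25M input tokens with 60% of them read from cache
print(f"${blended_input_cost(25_000_000, 0.6):.2f}")  # prints $1.40
```

With no cache hits the same 25M tokens cost $2.00, and with every token cached they cost $1.00, which is the 50% saving quoted in the tip.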

🔧 Tool Calling · 🔌 MCP Compatible · 📋 Structured Output · 🧠 Reasoning

Context Window

40,960

Input tokens

Full-context input ≈ $0.0033

Max Output

40,960

Completion tokens

Input Price / 1M

$0.0800

Prompt tokens

Output Price / 1M

$0.2400

Completion tokens

Top Benchmark

72.7

MMLU score — highest of MMLU, GPQA, MATH, HumanEval
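The full-context input figure in the Context Window card is presumably the window size multiplied by the input price, an amount that rounds to $0.00 at two decimal places. A minimal sketch of that arithmetic, using only the figures on this page (the function name is illustrative):

```python
CONTEXT_WINDOW = 40_960   # input tokens (from this page)
MAX_OUTPUT = 40_960       # completion tokens (from this page)
INPUT_PER_M = 0.08        # USD per 1M prompt tokens
OUTPUT_PER_M = 0.24       # USD per 1M completion tokens


def token_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD for a token count at a per-million-token price."""
    return tokens * price_per_million / 1_000_000


full_input = token_cost(CONTEXT_WINDOW, INPUT_PER_M)
full_output = token_cost(MAX_OUTPUT, OUTPUT_PER_M)

print(f"Full-context input: ${full_input:.4f}")   # prints $0.0033
print(f"Maximum output:     ${full_output:.4f}")  # prints $0.0098
```

Both amounts are fractions of a cent per request, so per-request costs only become visible at four decimal places or at high request volumes.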

Quality & Benchmarks

Evaluation scores for Qwen: Qwen3 32B. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.

Benchmark	Score	Rank	Source
GPQA	53.5	#84 of 125	artificial_analysis
MMLU	72.7	#82 of 127	artificial_analysis

Price History

Qwen: Qwen3 32B Pricing Trend

Input / 1M tokens: 0.0% · Output / 1M tokens: 0.0%

Current Input / 1M

$0.0800

Current Output / 1M

$0.2400

Performance History

Qwen: Qwen3 32B Speed Trend

Tokens/sec (higher is better) · Latency (lower is better)

Current TPS

0.00

Current Latency

0ms

Uptime

97.4%

Side-by-Side Pricing Table

Usage Type	Price / 1M Tokens
Input (Prompt)	$0.0800
Output (Completion)	$0.2400
Cache Read	$0.0400

Compare with: Qwen: Qwen3 30B A3B · Google: Gemma 3 27B · Meta: Llama 4 Scout

Cost Calculator

Estimate monthly spend for Qwen: Qwen3 32B based on your workload.

Input tokens / month

Range: 0 to 1.0B

Output tokens / month

Range: 0 to 500M

Estimated Monthly Cost

$4.88

25M input + 12M output tokens
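The estimate above can be reproduced by hand from the listed prices. The sketch below is one way to do it, not the calculator's actual implementation; the function name is illustrative, and `Decimal` is used only to avoid floating-point drift in currency amounts.

```python
from decimal import Decimal

INPUT_PER_M = Decimal("0.08")   # USD per 1M input tokens (from this page)
OUTPUT_PER_M = Decimal("0.24")  # USD per 1M output tokens (from this page)
MILLION = Decimal(1_000_000)


def monthly_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated monthly spend in USD at the listed per-million prices."""
    input_cost = Decimal(input_tokens) * INPUT_PER_M / MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_PER_M / MILLION
    return input_cost + output_cost


# 25M input + 12M output tokens: $2.00 + $2.88
print(f"${monthly_cost(25_000_000, 12_000_000):.2f}")  # prints $4.88
```

At the upper end of the calculator's ranges (1.0B input and 500M output tokens) the same formula gives $80.00 + $120.00 = $200.00 per month.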

Same Workload on Other Models

Model	Estimated Monthly Cost	Difference
Baidu: Qianfan-OCR-Fast (free)	$0.00	−$4.88
Free Models Router	$0.00	−$4.88
Google: Gemma 3 12B (free)	$0.00	−$4.88
Google: Gemma 3 27B (free)	$0.00	−$4.88
