Model Cost Profile
Developer: qwen · Tokenizer: Qwen · Quantization: unknown
Canonical ID: qwen/qwen3-vl-32b-instruct
Pricing updated Apr 24, 2026
Live Pricing
Input: $0.1040 / 1M tokens
Output: $0.4160 / 1M tokens
Last synced Apr 24, 2026 · MMLU score via public benchmark data
Qwen3 VL 32B Instruct by qwen offers a context window of 131,072 tokens and a maximum output of 32,768 tokens, which suits long-input tasks such as document summarization and multi-turn conversational AI. Input is priced at $0.1040 per million tokens and output at $0.4160 per million tokens, so spend scales linearly with usage and can be budgeted directly from expected token volume. A prompt that fills the entire context window costs about $0.01 in input tokens.
Context Window
131,072
Input tokens
Full-context input ≈ $0.01
Max Output
32,768
Completion tokens
Input Price / 1M
$0.1040
Prompt tokens
Output Price / 1M
$0.4160
Completion tokens
Top Benchmark
79.8
MMLU score — highest of MMLU, GPQA, MATH, HumanEval
Evaluation scores for Qwen: Qwen3 VL 32B Instruct. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.
Price History
Current Input / 1M
$0.1040
Current Output / 1M
$0.4160
Performance History
Current TPS
0.00
Current Latency
0ms
Uptime
100.0%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.1040 |
| Output (Completion) | $0.4160 |
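The table above can be turned into a per-request estimate. The following is a minimal sketch, assuming billing is strictly linear in token count with no minimum charge or caching discount (the listing does not say either way); `request_cost` is a hypothetical helper name, not part of any provider SDK.

```python
from decimal import Decimal

# Prices per 1M tokens, copied from the pricing table.
INPUT_PRICE_PER_M = Decimal("0.1040")
OUTPUT_PRICE_PER_M = Decimal("0.4160")
TOKENS_PER_M = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request (hypothetical helper).

    Assumes purely linear per-token billing, which is an assumption
    and not something the listing confirms.
    """
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    return (input_cost + output_cost) / TOKENS_PER_M


# A prompt that fills the whole 131,072-token context window, with no output.
full_context = request_cost(131_072, 0)
print(f"${full_context:.4f}")  # prints "$0.0136", shown as ≈ $0.01 on the page
```

`Decimal` is used instead of `float` so that small per-request amounts do not pick up binary rounding noise when summed over many requests.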
Estimate monthly spend for Qwen: Qwen3 VL 32B Instruct based on your workload.
Estimated Monthly Cost
$7.59
25M input + 12M output tokens
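The $7.59 figure can be reproduced from the listed prices. This is a small sketch under the assumption that the estimate is simply input volume times input price plus output volume times output price; `monthly_cost` is a hypothetical name used only for illustration.

```python
from decimal import Decimal

# Prices per 1M tokens, as listed on this page.
INPUT_PRICE_PER_M = Decimal("0.1040")
OUTPUT_PRICE_PER_M = Decimal("0.4160")


def monthly_cost(input_millions: int, output_millions: int) -> Decimal:
    """Return estimated monthly USD spend for volumes given in millions of tokens."""
    return (Decimal(input_millions) * INPUT_PRICE_PER_M
            + Decimal(output_millions) * OUTPUT_PRICE_PER_M)


# Workload from the estimate: 25M input tokens + 12M output tokens.
estimate = monthly_cost(25, 12)
print(f"${estimate:.2f}")  # prints "$7.59" (2.60 input + 4.992 output)
```

Output tokens cost four times as much as input tokens here, so in this workload the 12M output tokens account for roughly two thirds of the total.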