Model Cost Profile
Developer: qwen · Tokenizer: Qwen3 · Quantization: unknown
Canonical ID: qwen/qwen3-vl-8b-thinking
Pricing updated Apr 23, 2026
Live Pricing
Input: $0.1170
Output: $1.37
Last synced Apr 23, 2026 · MMLU score via public benchmark data
Qwen: Qwen3 VL 8B Thinking has a context window of 131,072 tokens and a maximum output of 32,768 completion tokens, which suits long-input tasks such as document summarization and multi-turn conversation. Input is listed at $0.1170 per million tokens and output at $1.37 per million tokens, so a completion token costs nearly 12 times as much as a prompt token. For high-volume workloads, the budget is therefore driven mainly by how many tokens the model generates rather than by how much context it reads.
Context Window
131,072
Input tokens
Full-context input ≈ $0.02
Max Output
32,768
Completion tokens
Input Price / 1M
$0.1170
Prompt tokens
Output Price / 1M
$1.37
Completion tokens
Top Benchmark
74.9
MMLU score — highest of MMLU, GPQA, MATH, HumanEval
Evaluation scores for Qwen: Qwen3 VL 8B Thinking. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.
Price History
Current Input / 1M
$0.1170
Current Output / 1M
$1.37
Performance History
Current TPS
0.00
Current Latency
0ms
Uptime
100.0%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.1170 |
| Output (Completion) | $1.37 |
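The per-request cost implied by the table above can be sketched in a few lines of Python. This is a minimal illustration, assuming the listed prices apply linearly per token; `request_cost` is a hypothetical helper name, not part of any provider SDK. It also reproduces the "Full-context input ≈ $0.02" figure shown on this page.

```python
# Prices and limits as listed on this page (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.1170
OUTPUT_PRICE_PER_M = 1.37
CONTEXT_WINDOW = 131_072   # input tokens
MAX_OUTPUT = 32_768        # completion tokens


def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Hypothetical helper: USD cost of one request, assuming linear per-token pricing."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        prompt_tokens * INPUT_PRICE_PER_M
        + completion_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# Cost of filling the whole context window with prompt tokens.
full_context_input = request_cost(CONTEXT_WINDOW, 0)
# Cost of a completion that uses the full output limit.
max_output_cost = request_cost(0, MAX_OUTPUT)

print(f"Full-context input: ${full_context_input:.4f}")
print(f"Maximum-length output: ${max_output_cost:.4f}")
```

A full-context prompt comes to about $0.0153, which rounds to the $0.02 shown above, while a maximum-length completion costs about $0.0449, roughly three times as much despite being a quarter of the token count.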
Estimate monthly spend for Qwen: Qwen3 VL 8B Thinking based on your workload.
Estimated Monthly Cost
$19
25M input + 12M output tokens
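The monthly estimate can be checked by hand with the listed prices. The sketch below assumes the estimate is a simple sum of input and output volume multiplied by the per-million prices, rounded to the nearest dollar; `monthly_cost` is a hypothetical function name used only for illustration.

```python
def monthly_cost(
    input_millions: float,
    output_millions: float,
    input_price: float = 0.1170,  # USD per 1M prompt tokens, as listed on this page
    output_price: float = 1.37,   # USD per 1M completion tokens, as listed on this page
) -> float:
    """Hypothetical helper: estimated monthly spend in USD for a given token volume."""
    return input_millions * input_price + output_millions * output_price


# Workload from the estimate above: 25M input + 12M output tokens per month.
estimate = monthly_cost(25, 12)
input_share = (25 * 0.1170) / estimate

print(f"Estimated monthly cost: ${estimate:.2f}")
print(f"Share spent on input: {input_share:.0%}")
```

Under these assumptions the total is about $19.37, which matches the $19 shown after rounding. Input accounts for only about 15% of the spend even though it makes up two thirds of the tokens.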