Context Window
131,072
Input tokens
Full-context input ≈ $0.01
Model Cost Profile
Developer: Qwen · Tokenizer: Qwen3 · Quantization: fp8
Canonical ID: qwen/qwen3-vl-8b-instruct
Pricing updated Apr 23, 2026
Live Pricing
Input: $0.0800 / 1M tokens
Output: $0.5000 / 1M tokens
Last synced Apr 23, 2026 · MMLU score via public benchmark data
Qwen3 VL 8B Instruct, developed by Qwen, accepts up to 131,072 input tokens and returns up to 32,768 completion tokens per request, which suits long-context tasks such as document summarization and multi-turn conversational agents. The API is priced at $0.08 per million input tokens and $0.50 per million output tokens, so filling the entire context window costs about $0.01 in input. Its instruction-following capabilities cover use cases from natural language understanding to content generation.
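The per-request arithmetic behind these figures can be sketched in a few lines of Python. This is a minimal illustration, not an official client: `request_cost` is a hypothetical helper, the prices and limits are copied from this page, and the sketch assumes the input and output limits are checked independently, which the page does not confirm.

```python
# Figures copied from this page (USD per 1M tokens, token limits).
INPUT_PRICE_PER_M = 0.08
OUTPUT_PRICE_PER_M = 0.50
CONTEXT_WINDOW = 131_072   # listed as input tokens
MAX_OUTPUT = 32_768        # listed as completion tokens


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: USD cost of one request at the listed prices."""
    # Assumption: each limit is enforced on its own side of the request.
    if not 0 <= input_tokens <= CONTEXT_WINDOW:
        raise ValueError("input_tokens outside the listed context window")
    if not 0 <= output_tokens <= MAX_OUTPUT:
        raise ValueError("output_tokens outside the listed max output")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Filling the whole context window with input and generating nothing:
print(f"${request_cost(CONTEXT_WINDOW, 0):.4f}")  # $0.0105
```

At the listed prices a full-context prompt rounds to the $0.01 shown on the stat card, while a maximum-length completion costs slightly more than the prompt that produced it because output tokens are priced higher.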
Max Output
32,768
Completion tokens
Input Price / 1M
$0.0800
Prompt tokens
Output Price / 1M
$0.5000
Completion tokens
Top Benchmark
74.3
MMLU score — highest of MMLU, GPQA, MATH, HumanEval
Evaluation scores for Qwen: Qwen3 VL 8B Instruct. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.
Price History
Current Input / 1M
$0.0800
Current Output / 1M
$0.5000
Performance History
Current TPS
0.00
Current Latency
0ms
Uptime
93.3%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.0800 |
| Output (Completion) | $0.5000 |
Estimate monthly spend for Qwen: Qwen3 VL 8B Instruct based on your workload.
Estimated Monthly Cost
$8.00
25M input + 12M output tokens
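The estimate above follows directly from the listed prices. As a rough sketch (the function name `monthly_cost` is an assumption made for illustration, and `Decimal` is used only to avoid floating-point rounding in currency math):

```python
from decimal import Decimal

# USD per 1M tokens, as listed in the pricing table on this page.
INPUT_PRICE_PER_M = Decimal("0.08")
OUTPUT_PRICE_PER_M = Decimal("0.50")
MILLION = Decimal(1_000_000)


def monthly_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: monthly spend in USD for a given token volume."""
    input_cost = Decimal(input_tokens) / MILLION * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / MILLION * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# 25M input tokens ($2.00) plus 12M output tokens ($6.00):
print(monthly_cost(25_000_000, 12_000_000))  # 8.00
```

In this workload the output side accounts for three quarters of the bill despite being under half the input volume, so trimming completion length is likely the more effective lever.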