Context Window
32,000
Input tokens
Full-context input ≈ $0.01
Model Cost Profile
Developer: qwen · Tokenizer: Qwen · Quantization: fp8
Canonical ID: qwen/qwen2.5-vl-72b-instruct
Pricing updated Apr 24, 2026
Live Pricing
Input: $0.2500
Output: $0.7500
Last synced Apr 24, 2026 · MMLU score via public benchmark data
Qwen2.5 VL 72B Instruct, a vision-language model developed by Qwen, supports a context window of 32,768 tokens. It suits tasks such as document summarization, conversational agents, and multi-turn dialogue, which makes it a reasonable fit for teams that process large volumes of text. It is priced at $0.25 per million input tokens and $0.75 per million output tokens.
Max Output
—
Not specified
Input Price / 1M
$0.2500
Prompt tokens
Output Price / 1M
$0.7500
Completion tokens
Top Benchmark
72.0
MMLU score — highest of MMLU, GPQA, MATH, HumanEval
Evaluation scores for Qwen: Qwen2.5 VL 72B Instruct. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.
Price History
Current Input / 1M
$0.2500
Current Output / 1M
$0.7500
Performance History
Current TPS
0.00
Current Latency
0ms
Uptime
100.0%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.2500 |
| Output (Completion) | $0.7500 |
Estimate monthly spend for Qwen: Qwen2.5 VL 72B Instruct based on your workload.
Estimated Monthly Cost
$15
25M input + 12M output tokens
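The estimate above follows directly from the listed per-million rates. Here is a minimal sketch of that arithmetic in Python, assuming cost is simply tokens times rate with no caching discounts, minimum charges, or image-token surcharges. The function name `estimate_cost` and the constants are illustrative and do not belong to any provider API.

```python
from decimal import Decimal

# Rates taken from the pricing table on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.25")
OUTPUT_PRICE_PER_M = Decimal("0.75")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of a workload at the listed rates (hypothetical helper)."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Monthly workload from the estimate: 25M input + 12M output tokens.
monthly = estimate_cost(25_000_000, 12_000_000)
# A single prompt that fills the listed 32,000-token context window.
full_context = estimate_cost(32_000, 0)

print(f"Monthly: ${monthly:.2f}")
print(f"Full-context input: ${full_context:.3f}")
```

Under these assumptions the monthly workload comes to $15.25, which the page rounds to $15, and a full 32,000-token prompt costs $0.008, which the page rounds to $0.01.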