Context Window
32,768
Input tokens
Full-context input ≈ $0.0013 (32,768 tokens at $0.04 per 1M)
Model Cost Profile
Developer: qwen · Tokenizer: Qwen · Instruct: chatml · Quantization: unknown
Canonical ID: qwen/qwen-2.5-7b-instruct
Pricing updated Apr 24, 2026
Live Pricing
Input: $0.0400 / 1M tokens
Output: $0.1000 / 1M tokens
Last synced Apr 24, 2026
Qwen2.5 7B Instruct by qwen has a context window of 32,768 tokens, which suits tasks such as document summarization and conversational AI. Teams using this API model pay $0.04 per million input tokens and $0.10 per million output tokens, so each output token costs 2.5 times as much as an input token and the input-to-output mix of a workload drives the budget. At these rates the model fits organizations that need to process large volumes of text in real-time applications.
Max Output
32,768
Completion tokens
Input Price / 1M
$0.0400
Prompt tokens
Output Price / 1M
$0.1000
Completion tokens
Top Benchmark
Pending
No benchmark data yet
Price History
Current Input / 1M
$0.0400
Current Output / 1M
$0.1000
Performance History
Current TPS
0.00
Current Latency
0ms
Uptime
99.9%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.0400 |
| Output (Completion) | $0.1000 |
Estimate monthly spend for Qwen: Qwen2.5 7B Instruct based on your workload.
Estimated Monthly Cost
$2.20
25M input + 12M output tokens
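The estimate above follows directly from the listed per-million prices. A minimal sketch of that arithmetic, assuming the page's rates of $0.04 (input) and $0.10 (output) per 1M tokens; the function and constant names are hypothetical and not part of any published API:

```python
from decimal import Decimal

# Prices taken from the pricing table on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.04")
OUTPUT_PRICE_PER_M = Decimal("0.10")
CONTEXT_WINDOW = 32_768  # tokens, as listed on this page
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost for a given number of input and output tokens."""
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    return (input_cost + output_cost) / TOKENS_PER_M


# Monthly workload from the estimate: 25M input + 12M output tokens.
monthly = estimate_cost(25_000_000, 12_000_000)
print(f"Estimated monthly cost: ${monthly:.2f}")

# Cost of a single prompt that fills the whole context window.
full_context = estimate_cost(CONTEXT_WINDOW, 0)
print(f"Full-context input: ${full_context:.4f}")
```

For the 25M + 12M workload this gives $1.00 of input plus $1.20 of output. A single full-context prompt works out to roughly $0.0013, which shows as $0.00 when rounded to two decimal places.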