Model Cost Profile

Qwen: Qwen3 VL 32B Instruct

Developer: qwen · Tokenizer: Qwen · Quantization: unknown

Canonical ID: qwen/qwen3-vl-32b-instruct

Pricing updated Apr 24, 2026

Input rank: #98 · Output rank: #112

Live Pricing

Input: $0.1040

Output: $0.4160


Last synced Apr 24, 2026 · MMLU score via public benchmark data

Qwen3 VL 32B Instruct by qwen has a context window of 131,072 tokens and a maximum output of 32,768 tokens, which suits long-context tasks such as document summarization and multi-turn conversation. Input is priced at $0.1040 per million tokens and output at $0.4160 per million tokens, so spend scales linearly with usage. The model supports vision input, tool calling, and structured output.

👁 Vision · 🔧 Tool Calling · 🔌 MCP Compatible · 📋 Structured Output

Context Window

131,072

Input tokens

Full-context input ≈ $0.01
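
The full-context figure above follows from the input price per million tokens. A minimal sketch of that arithmetic, assuming pricing is strictly linear in token count (the function name `input_cost` is illustrative and not part of any API):

```python
from decimal import Decimal

# Values taken from this page.
INPUT_PRICE_PER_M = Decimal("0.1040")  # USD per 1M prompt tokens
CONTEXT_WINDOW = 131_072               # maximum input tokens

def input_cost(tokens: int, price_per_m: Decimal = INPUT_PRICE_PER_M) -> Decimal:
    """Cost in USD of `tokens` prompt tokens, assuming linear per-token pricing."""
    return Decimal(tokens) * price_per_m / Decimal(1_000_000)

full_context = input_cost(CONTEXT_WINDOW)
print(f"Full-context input: ${full_context:.4f}")  # Full-context input: $0.0136
```

At two decimal places this rounds to the $0.01 shown on the page.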

Max Output

32,768

Completion tokens

Input Price / 1M

$0.1040

Prompt tokens

Output Price / 1M

$0.4160

Completion tokens

Top Benchmark

79.8

MMLU score — highest of MMLU, GPQA, MATH, HumanEval

Quality & Benchmarks

Evaluation scores for Qwen: Qwen3 VL 32B Instruct. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.

Benchmark	Score	Rank	Source
GPQA	66.8	#52 of 125	artificial_analysis
MMLU	79.8	#46 of 127	artificial_analysis

Price History

Qwen: Qwen3 VL 32B Instruct Pricing Trend

Input / 1M tokens: 0.0% · Output / 1M tokens: 0.0%

Current Input / 1M

$0.1040

Current Output / 1M

$0.4160

Performance History

Qwen: Qwen3 VL 32B Instruct Speed Trend

Tokens/sec (higher is better) · Latency (lower is better)

Current TPS

0.00

Current Latency

0ms

Uptime

100.0%

Side-by-Side Pricing Table

Usage Type	Price / 1M Tokens
Input (Prompt)	$0.1040
Output (Completion)	$0.4160


Cost Calculator

Estimate monthly spend for Qwen: Qwen3 VL 32B Instruct based on your workload.

Input tokens / month

0 to 1.0B

Output tokens / month

0 to 500M

Estimated Monthly Cost

$7.59

25M input + 12M output tokens
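
The estimate above can be reproduced from the listed prices. The sketch below assumes the calculator simply multiplies each token count by its price per million and rounds to cents; `monthly_cost` is a hypothetical helper, not the page's actual implementation:

```python
from decimal import Decimal, ROUND_HALF_UP

# Prices taken from this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.1040")
OUTPUT_PRICE_PER_M = Decimal("0.4160")

def monthly_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated monthly spend in USD, rounded to cents (assumes linear pricing)."""
    million = Decimal(1_000_000)
    cost = (Decimal(input_tokens) * INPUT_PRICE_PER_M
            + Decimal(output_tokens) * OUTPUT_PRICE_PER_M) / million
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# 25M input + 12M output tokens, the workload shown above.
estimate = monthly_cost(25_000_000, 12_000_000)
print(f"${estimate}")  # $7.59
```

Here the input side contributes $2.60 and the output side $4.992, so output tokens account for roughly two thirds of the total even though there are fewer of them.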

Same Workload on Other Models

Model	Cost	Difference
Baidu: Qianfan-OCR-Fast (free)	$0.00	−$7.59
Free Models Router	$0.00	−$7.59
Google: Gemma 3 12B (free)	$0.00	−$7.59
Google: Gemma 3 27B (free)	$0.00	−$7.59

