Model Cost Profile

Qwen: Qwen3 VL 8B Instruct

Developer: qwen · Tokenizer: Qwen3 · Quantization: fp8

Canonical ID: qwen/qwen3-vl-8b-instruct

Pricing updated Apr 23, 2026

Input rank: #72 · Output rank: #117

Live Pricing

Input: $0.0800 / 1M tokens

Output: $0.5000 / 1M tokens


Last synced Apr 23, 2026 · MMLU score via public benchmark data

Qwen3 VL 8B Instruct, developed by qwen, is a vision-language instruct model with a context window of up to 131,072 tokens, which suits long-input tasks such as document summarization and multi-turn conversational agents. Through the API, input costs $0.08 per million tokens and output costs $0.50 per million tokens. Its instruction tuning targets use cases ranging from natural language understanding to content generation.
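To make the per-million prices concrete, the sketch below converts token counts into the cost of a single request. It is a minimal illustration, not part of any official SDK: the names `request_cost`, `CONTEXT_WINDOW`, and the price constants are hypothetical, the figures are copied from this page, and the context window is treated as a cap on input tokens, as this page labels it.

```python
from decimal import Decimal

# Figures copied from this page; all names here are illustrative assumptions.
CONTEXT_WINDOW = 131_072               # maximum input tokens listed on this page
INPUT_PRICE_PER_M = Decimal("0.08")    # USD per 1M prompt tokens
OUTPUT_PRICE_PER_M = Decimal("0.50")   # USD per 1M completion tokens
ONE_MILLION = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int = 0) -> Decimal:
    """Return the USD cost of one request at the listed prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > CONTEXT_WINDOW:
        raise ValueError("input exceeds the 131,072-token context window")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / ONE_MILLION


# Cost of filling the whole context window with input and generating nothing.
full_context = request_cost(CONTEXT_WINDOW)
print(full_context)  # 0.01048576
```

Filling the entire window therefore costs about one cent of input, so for long-document workloads the output side ($0.50 per million) is likely to dominate spend once completions grow.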

👁 Vision · 🔧 Tool Calling · 🔌 MCP Compatible · 📋 Structured Output

Context Window

131,072

Input tokens

Full-context input ≈ $0.01

Max Output

32,768

Completion tokens

Input Price / 1M

$0.0800

Prompt tokens

Output Price / 1M

$0.5000

Completion tokens

Top Benchmark

74.3

MMLU score — highest of MMLU, GPQA, MATH, HumanEval

Quality & Benchmarks

Evaluation scores for Qwen: Qwen3 VL 8B Instruct. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.

Benchmark	Score	Rank	Source
GPQA	58.9	#72 of 125	artificial_analysis
MMLU	74.3	#77 of 127	artificial_analysis
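The "Top Benchmark" rule can be sketched as picking the maximum over whichever of the four benchmarks have a score. This is an illustrative reading of the rule, not the site's actual code: the function name `top_benchmark` is hypothetical, and representing MATH and HumanEval as `None` is an assumption based on the table above listing no scores for them.

```python
from typing import Optional

# Scores from the table above. MATH and HumanEval are not listed on this page,
# so they are represented as missing (an assumption made for illustration).
scores: dict[str, Optional[float]] = {
    "MMLU": 74.3,
    "GPQA": 58.9,
    "MATH": None,
    "HumanEval": None,
}


def top_benchmark(scores: dict[str, Optional[float]]) -> Optional[tuple[str, float]]:
    """Return (name, score) for the highest available score, or None if none exist."""
    available = {name: s for name, s in scores.items() if s is not None}
    if not available:
        return None
    best = max(available, key=lambda name: available[name])
    return best, available[best]


print(top_benchmark(scores))  # ('MMLU', 74.3)
```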

Price History

Qwen: Qwen3 VL 8B Instruct Pricing Trend

Input / 1M tokens: 0.0% · Output / 1M tokens: 0.0%

Current Input / 1M

$0.0800

Current Output / 1M

$0.5000

Performance History

Qwen: Qwen3 VL 8B Instruct Speed Trend

Tokens/sec (higher is better) · Latency (lower is better)

Current TPS

0.00

Current Latency

0ms

Uptime

93.3%

Side-by-Side Pricing Table

Usage Type	Price / 1M Tokens
Input (Prompt)	$0.0800
Output (Completion)	$0.5000


Cost Calculator

Estimate monthly spend for Qwen: Qwen3 VL 8B Instruct based on your workload.

Input tokens / month

0 to 1.0B

Output tokens / month

0 to 500M

Estimated Monthly Cost

$8.00

25M input + 12M output tokens
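The estimate above follows directly from the listed prices, and the sketch below reproduces it for an arbitrary workload. It is a minimal illustration under stated assumptions: the function name `monthly_cost` and the constant names are hypothetical, and prices are the ones shown on this page.

```python
from decimal import Decimal, ROUND_HALF_UP

# Prices copied from this page (USD per 1M tokens); names are illustrative.
INPUT_PRICE_PER_M = Decimal("0.08")
OUTPUT_PRICE_PER_M = Decimal("0.50")


def monthly_cost(input_tokens: int, output_tokens: int) -> dict[str, Decimal]:
    """Return input, output, and total monthly cost in USD, rounded to cents."""
    million = Decimal(1_000_000)
    cents = Decimal("0.01")
    input_cost = Decimal(input_tokens) / million * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / million * OUTPUT_PRICE_PER_M
    return {
        "input": input_cost.quantize(cents, rounding=ROUND_HALF_UP),
        "output": output_cost.quantize(cents, rounding=ROUND_HALF_UP),
        "total": (input_cost + output_cost).quantize(cents, rounding=ROUND_HALF_UP),
    }


# The workload shown in the calculator: 25M input + 12M output tokens.
estimate = monthly_cost(25_000_000, 12_000_000)
print(f"${estimate['total']}")  # $8.00
```

For this workload the split is $2.00 for input and $6.00 for output, so three quarters of the bill comes from completion tokens even though they are fewer than half as numerous as prompt tokens.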

Same Workload on Other Models

Model	Monthly Cost	Difference
Arcee AI: Trinity Large Preview (free)	$0.00	−$8.00
Free Models Router	$0.00	−$8.00
Google: Gemma 3 12B (free)	$0.00	−$8.00
Google: Gemma 3 27B (free)	$0.00	−$8.00

