Context Window
- 65,536 input tokens (full-context input ≈ $0.04)
Model Cost Profile
- Developer: z-ai · Tokenizer: Other · Quantization: fp8
- Canonical ID: z-ai/glm-4.5v
- Pricing updated Apr 24, 2026

Live Pricing
- Input: $0.60 / 1M tokens
- Output: $1.80 / 1M tokens
- Last synced Apr 24, 2026 · MMLU score via public benchmark data
Z.ai: GLM 4.5V, developed by z-ai, offers a 65,536-token context window, which suits workloads that ingest long inputs, such as legal document review or long-form content generation. Teams using this API can expect to pay $0.60 per million input tokens and $1.80 per million output tokens; at scale, output-heavy workloads dominate the bill, so budgets for large projects should be modeled before rollout.
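The per-million rates above translate directly into per-request dollar costs. A minimal sketch, using only the rates and context size listed on this page:

```python
# Published rates for GLM 4.5V (USD per 1M tokens, from this page)
INPUT_PRICE = 0.60
OUTPUT_PRICE = 1.80

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request at the listed rates."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000

# A prompt filling the full 65,536-token context costs about $0.04 in input alone:
print(round(request_cost(65_536, 0), 4))  # → 0.0393
```

This is where the "Full-context input ≈ $0.04" figure above comes from: 65,536 × $0.60 / 1,000,000 ≈ $0.039.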
Max Output
- 16,384 completion tokens

Input Price / 1M
- $0.60 (prompt tokens)

Output Price / 1M
- $1.80 (completion tokens)

Top Benchmark
- 75.1 — MMLU score, highest of MMLU, GPQA, MATH, HumanEval
Evaluation scores for Z.ai: GLM 4.5V. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.
Price History
- Current Input / 1M: $0.60
- Current Output / 1M: $1.80
Performance History
- Uptime: 93.5%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.60 |
| Output (Completion) | $1.80 |
| Cache Read | $0.11 |
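The cache-read rate matters for workloads that resend a stable prompt prefix. Assuming the usual prefix-caching billing model (cached prompt tokens are billed at the cache-read rate instead of the full input rate), the savings can be sketched as follows; the 80% hit rate is a hypothetical workload figure, not data from this page:

```python
# Rates from the table above (USD per 1M tokens)
INPUT_PRICE = 0.60       # fresh prompt tokens
CACHE_READ_PRICE = 0.11  # prompt tokens served from cache

def blended_input_cost(total_tokens: int, cached_fraction: float) -> float:
    """Input cost in USD when part of the prompt is served from cache."""
    cached = total_tokens * cached_fraction
    fresh = total_tokens - cached
    return (fresh * INPUT_PRICE + cached * CACHE_READ_PRICE) / 1_000_000

# Hypothetical month: 10M prompt tokens, 80% served from cache
print(round(blended_input_cost(10_000_000, 0.8), 2))  # → 2.08
```

Without caching the same 10M prompt tokens would cost $6.00, so a high hit rate cuts input spend by roughly two-thirds.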
Estimate monthly spend for Z.ai: GLM 4.5V based on your workload.

Estimated Monthly Cost
- ≈ $37 for 25M input + 12M output tokens
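The $37 figure follows from the listed rates; a quick check, pure arithmetic with no external data:

```python
# Reproducing the monthly estimate from the listed rates ($/1M tokens)
input_cost = 25 * 0.60    # 25M input tokens  → $15.00
output_cost = 12 * 1.80   # 12M output tokens → $21.60
total = input_cost + output_cost
print(f"${total:.2f}")    # → $36.60, rounded to $37 above
```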