Model Cost Profile

Z.ai: GLM 4.5 Air

Developer: z-ai · Tokenizer: Other · Quantization: bf16

Canonical ID: z-ai/glm-4.5-air

Pricing updated Apr 24, 2026

Input rank: #106 · Output rank: #161

Live Pricing

Input: $0.1300

Output: $0.8500

Last synced Apr 24, 2026 · MMLU score via public benchmark data

Z.ai: GLM 4.5 Air has a context window of 131,072 tokens, which suits long-document analysis and multi-turn conversational agents. At $0.13 per 1M input tokens and $0.85 per 1M output tokens, output costs about 6.5 times as much as input, so spend is driven mainly by completion length. Typical uses include customer support, content generation, and research, where long-context handling helps keep results accurate and coherent.

💡 Enable prompt caching to save 81% on repeated input tokens ($0.0250/M cached vs $0.1300/M standard).
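The 81% figure follows from the two listed rates. A minimal sketch of the arithmetic, with constant and function names chosen here purely for illustration:

```python
# Listed rates in USD per 1M tokens (taken from the caching tip above).
STANDARD_INPUT_PER_M = 0.13
CACHED_INPUT_PER_M = 0.025


def cache_discount(standard_per_m: float, cached_per_m: float) -> float:
    """Return the fractional saving on input tokens that are read from cache."""
    return 1 - cached_per_m / standard_per_m


discount = cache_discount(STANDARD_INPUT_PER_M, CACHED_INPUT_PER_M)
print(f"Saving on cached input tokens: {discount:.1%}")  # 80.8%, shown as 81% on the page
```

The saving applies only to the cached part of the prompt. Uncached input tokens and all output tokens are still billed at the standard rates.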

🔧 Tool Calling · 🔌 MCP Compatible · 🧠 Reasoning

Context Window

131,072

Input tokens

Full-context input ≈ $0.02

Max Output

98,304

Completion tokens

Input Price / 1M

$0.1300

Prompt tokens

Output Price / 1M

$0.8500

Completion tokens

Top Benchmark

81.5

MMLU score — highest of MMLU, GPQA, MATH, HumanEval

Quality & Benchmarks

Evaluation scores for Z.ai: GLM 4.5 Air. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.

Benchmark	Score	Rank	Source
GPQA	73.3	#42 of 125	artificial_analysis
MMLU	81.5	#36 of 127	artificial_analysis

Price History

Z.ai: GLM 4.5 Air Pricing Trend

Input / 1M tokens: 0.0% · Output / 1M tokens: 0.0%

Current Input / 1M

$0.1300

Current Output / 1M

$0.8500

Performance History

Z.ai: GLM 4.5 Air Speed Trend

Tokens/sec (higher is better) · Latency (lower is better)

Current TPS

0.00

Current Latency

0ms

Uptime

100.0%

Side-by-Side Pricing Table

Usage Type	Price / 1M Tokens
Input (Prompt)	$0.1300
Output (Completion)	$0.8500
Cache Read	$0.0250
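
The three rates in the table can be combined into a per-request estimate. The sketch below is illustrative only: the function name and parameters are hypothetical, and it assumes that cached prompt tokens are billed at the Cache Read rate in place of the Input rate, which the page does not spell out.

```python
# Rates in USD per 1M tokens, copied from the pricing table.
INPUT_PER_M = 0.13
OUTPUT_PER_M = 0.85
CACHE_READ_PER_M = 0.025


def request_cost(prompt_tokens: int, completion_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the part of prompt_tokens billed at the
    Cache Read rate instead of the Input rate.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    total_per_m = (
        uncached_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + completion_tokens * OUTPUT_PER_M
    )
    return total_per_m / 1_000_000


# A prompt that fills the whole 131,072-token context window, with no output.
full_context = request_cost(131_072, 0)
print(f"${full_context:.4f}")  # $0.0170
```

This matches the "Full-context input ≈ $0.02" figure listed under Context Window once rounded to the nearest cent.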

Cost Calculator

Estimate monthly spend for Z.ai: GLM 4.5 Air based on your workload.

Input tokens / month (range: 0 to 1.0B)

Output tokens / month (range: 0 to 500M)

Estimated Monthly Cost

$13

25M input + 12M output tokens
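
The $13 estimate can be reproduced from the listed rates. A minimal sketch, assuming simple linear per-token pricing with no caching; the function name is illustrative and not part of any published API:

```python
def monthly_cost(
    input_tokens: int,
    output_tokens: int,
    input_per_m: float = 0.13,   # USD per 1M prompt tokens
    output_per_m: float = 0.85,  # USD per 1M completion tokens
) -> float:
    """Estimate monthly spend in USD for a given token volume."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# The calculator's example workload: 25M input + 12M output tokens.
estimate = monthly_cost(25_000_000, 12_000_000)
print(f"${estimate:.2f}")  # $13.45
```

The unrounded result is $13.45 ($3.25 for input plus $10.20 for output), so the calculator appears to display the estimate rounded to a whole dollar.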

Same Workload on Other Models

Model	Monthly Cost	Difference
Baidu: Qianfan-OCR-Fast (free)	$0.00	−$13
Free Models Router	$0.00	−$13
Google: Gemma 3 12B (free)	$0.00	−$13
Google: Gemma 3 27B (free)	$0.00	−$13
