Context Window
131,072
Input tokens
Full-context input ≈ $0.02
Model Cost Profile
Developer: z-ai · Tokenizer: Other · Quantization: bf16
Canonical ID: z-ai/glm-4.5-air
Pricing updated Apr 24, 2026
Live Pricing
Input: $0.1300
Output: $0.8500
Last synced Apr 24, 2026 · MMLU score via public benchmark data
Z.ai: GLM 4.5 Air has a context window of 131,072 tokens, which suits long-document analysis and multi-turn conversational agents. Input is priced at $0.13 per 1M tokens and output at $0.85 per 1M tokens, so a completion token costs roughly 6.5 times as much as a prompt token and output-heavy workloads dominate spend. Typical uses include customer support, content generation, and research, where long inputs must stay in context to produce accurate and coherent results.
Max Output
98,304
Completion tokens
Input Price / 1M
$0.1300
Prompt tokens
Output Price / 1M
$0.8500
Completion tokens
Top Benchmark
81.5
MMLU score — highest of MMLU, GPQA, MATH, HumanEval
Evaluation scores for Z.ai: GLM 4.5 Air. The “Top Benchmark” shown above is the highest of the model's MMLU, GPQA, MATH, and HumanEval scores.
Price History
Current Input / 1M
$0.1300
Current Output / 1M
$0.8500
Performance History
Current TPS
0.00
Current Latency
0ms
Uptime
100.0%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.1300 |
| Output (Completion) | $0.8500 |
| Cache Read | $0.0250 |
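
The rates in the table can be combined into a per-request estimate. The following is a minimal sketch, not an official billing formula: the function name `request_cost` and its parameters are hypothetical, and it assumes that cached prompt tokens are billed at the Cache Read rate instead of the Input rate.

```python
from decimal import Decimal

# Per-1M-token prices in USD, copied from the table above.
PRICE_PER_MILLION = {
    "input": Decimal("0.13"),
    "output": Decimal("0.85"),
    "cache_read": Decimal("0.025"),
}
TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(prompt_tokens: int, completion_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the part of the prompt served from cache,
    billed at the cache-read rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached = prompt_tokens - cached_tokens
    total = (
        uncached * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cache_read"]
        + completion_tokens * PRICE_PER_MILLION["output"]
    )
    return total / TOKENS_PER_MILLION


# A prompt that fills the whole 131,072-token context window, no completion.
full_context = request_cost(prompt_tokens=131_072, completion_tokens=0)
print(f"${full_context:.4f}")  # $0.0170
```

The full-context figure of about $0.017 is consistent with the "≈ $0.02" shown on the page. `Decimal` is used here so that small per-token amounts are not distorted by binary floating-point rounding.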
Estimate monthly spend for Z.ai: GLM 4.5 Air based on your workload.
Estimated Monthly Cost
$13
25M input + 12M output tokens
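
The estimate can be reproduced by hand from the listed prices. This is a short sketch under the assumption that the calculator applies the Input and Output rates only, with no caching discount; `monthly_cost` is a hypothetical helper name.

```python
from decimal import Decimal

INPUT_PRICE = Decimal("0.13")   # USD per 1M prompt tokens
OUTPUT_PRICE = Decimal("0.85")  # USD per 1M completion tokens


def monthly_cost(input_millions: Decimal, output_millions: Decimal) -> Decimal:
    """Return the monthly USD cost for a workload given in millions of tokens."""
    return input_millions * INPUT_PRICE + output_millions * OUTPUT_PRICE


# The workload shown above: 25M input tokens and 12M output tokens per month.
estimate = monthly_cost(Decimal(25), Decimal(12))
print(estimate)  # 13.45
```

The exact figure is $13.45 ($3.25 for input plus $10.20 for output), which the page appears to round to $13.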