Context Window
- 65,536 input tokens (full-context input ≈ $0.04)
Model Cost Profile
- Developer: z-ai · Tokenizer: Other · Quantization: fp8
- Canonical ID: z-ai/glm-4.5v
- Pricing updated Apr 24, 2026

Live Pricing
- Input: $0.60 / 1M tokens
- Output: $1.80 / 1M tokens
- Last synced Apr 24, 2026 · MMLU score via public benchmark data
Z.ai: GLM 4.5V, developed by z-ai, offers a 65,536-token context window, which suits workloads that ingest long inputs, such as legal document review or long-form content generation. Teams using this API can expect to pay $0.60 per million input tokens and $1.80 per million output tokens; at scale, output-heavy workloads dominate the bill, so budgets for large projects should be modeled before rollout.
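The per-million rates above translate directly into per-request dollar costs. A minimal sketch, using only the rates and context size listed on this page:

```python
# Published rates for GLM 4.5V (USD per 1M tokens, from this page)
INPUT_PRICE = 0.60
OUTPUT_PRICE = 1.80

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request at the listed rates."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000

# A prompt filling the full 65,536-token context costs about $0.04 in input alone:
print(round(request_cost(65_536, 0), 4))  # → 0.0393
```

This is where the "Full-context input ≈ $0.04" figure above comes from: 65,536 × $0.60 / 1,000,000 ≈ $0.039.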
Max Output
- 16,384 completion tokens

Input Price / 1M
- $0.60 (prompt tokens)

Output Price / 1M
- $1.80 (completion tokens)

Top Benchmark
- 75.1 — MMLU score, highest of MMLU, GPQA, MATH, HumanEval
Evaluation scores for Z.ai: GLM 4.5V. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.
Price History
- Current Input / 1M: $0.60
- Current Output / 1M: $1.80
Performance History
- Uptime: 93.5%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.60 |
| Output (Completion) | $1.80 |
| Cache Read | $0.11 |
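The cache-read rate matters for workloads that resend a stable prompt prefix. Assuming the usual prefix-caching billing model (cached prompt tokens are billed at the cache-read rate instead of the full input rate), the savings can be sketched as follows; the 80% hit rate is a hypothetical workload figure, not data from this page:

```python
# Rates from the table above (USD per 1M tokens)
INPUT_PRICE = 0.60       # fresh prompt tokens
CACHE_READ_PRICE = 0.11  # prompt tokens served from cache

def blended_input_cost(total_tokens: int, cached_fraction: float) -> float:
    """Input cost in USD when part of the prompt is served from cache."""
    cached = total_tokens * cached_fraction
    fresh = total_tokens - cached
    return (fresh * INPUT_PRICE + cached * CACHE_READ_PRICE) / 1_000_000

# Hypothetical month: 10M prompt tokens, 80% served from cache
print(round(blended_input_cost(10_000_000, 0.8), 2))  # → 2.08
```

Without caching the same 10M prompt tokens would cost $6.00, so a high hit rate cuts input spend by roughly two-thirds.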
Estimate monthly spend for Z.ai: GLM 4.5V based on your workload.

Estimated Monthly Cost
- ≈ $37 for 25M input + 12M output tokens
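The $37 figure follows from the listed rates; a quick check, pure arithmetic with no external data:

```python
# Reproducing the monthly estimate from the listed rates ($/1M tokens)
input_cost = 25 * 0.60    # 25M input tokens  → $15.00
output_cost = 12 * 1.80   # 12M output tokens → $21.60
total = input_cost + output_cost
print(f"${total:.2f}")    # → $36.60, rounded to $37 above
```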