Model Cost Profile
Developer: z-ai · Tokenizer: Other · Quantization: fp8
Canonical ID: z-ai/glm-4.6-20251208
Pricing updated Apr 24, 2026
Live Pricing
Input: $0.3000 / 1M tokens
Output: $0.9000 / 1M tokens
Last synced Apr 24, 2026 · MMLU score via public benchmark data
Z.ai: GLM 4.6V offers a context window of 131,072 tokens, which suits workloads that need long inputs, such as legal document analysis and long-form content generation. Input is billed at $0.30 per million tokens and output at $0.90 per million tokens, so output tokens cost three times as much as input tokens, and a prompt that fills the entire context window costs about $0.04. Because billing is per token, spend scales linearly with usage, which makes costs for high-volume text workloads straightforward to forecast.
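The per-request arithmetic behind these figures can be sketched in a few lines of Python. This is a minimal illustration rather than an official client: the function name `request_cost` and the constant names are hypothetical, and the prices and context size are simply the values listed on this page.

```python
from decimal import Decimal

# Figures copied from this page; the names are illustrative, not part of any API.
INPUT_PRICE_PER_M = Decimal("0.30")   # USD per 1M prompt tokens
OUTPUT_PRICE_PER_M = Decimal("0.90")  # USD per 1M completion tokens
CONTEXT_WINDOW = 131_072              # tokens


def request_cost(input_tokens: int, output_tokens: int = 0) -> Decimal:
    """Return the USD cost of one request at the listed per-million prices."""
    million = Decimal(1_000_000)
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / million


# A prompt that fills the whole context window, with no output counted.
full_context = request_cost(CONTEXT_WINDOW)
print(f"Full-context input: ${full_context:.4f}")  # Full-context input: $0.0393
```

Rounded to cents, the result matches the "Full-context input ≈ $0.04" figure. `Decimal` is used here only to avoid floating-point rounding noise in currency values.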
Context Window
131,072
Input tokens
Full-context input ≈ $0.04
Max Output
131,072
Completion tokens
Input Price / 1M
$0.3000
Prompt tokens
Output Price / 1M
$0.9000
Completion tokens
Top Benchmark
79.9
MMLU score — highest of MMLU, GPQA, MATH, HumanEval
Evaluation scores for Z.ai: GLM 4.6V. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.
Price History
Current Input / 1M
$0.3000
Current Output / 1M
$0.9000
Performance History
Current TPS
0.00
Current Latency
0ms
Uptime
95.0%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.3000 |
| Output (Completion) | $0.9000 |
Estimate monthly spend for Z.ai: GLM 4.6V based on your workload.
Estimated Monthly Cost
$18.30
25M input + 12M output tokens
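The estimate above follows directly from the listed prices. The sketch below reproduces it under the assumption that monthly volume is expressed in millions of tokens. The helper `monthly_cost` is hypothetical and is not the calculator this page uses.

```python
from decimal import Decimal


def monthly_cost(input_millions, output_millions,
                 input_price=Decimal("0.30"), output_price=Decimal("0.90")):
    """Estimate monthly USD spend from token volumes given in millions.

    Default prices are the per-1M-token rates listed on this page.
    """
    input_part = Decimal(str(input_millions)) * input_price
    output_part = Decimal(str(output_millions)) * output_price
    return input_part + output_part


# Example workload from this page: 25M input + 12M output tokens per month.
total = monthly_cost(25, 12)
print(f"${total:.2f}")  # $18.30
```

At this mix, output accounts for $10.80 of the total even though it is less than half the input volume, because output tokens are priced at three times the input rate.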