Context Window
202,752 input tokens (full-context input ≈ $0.01)

Model Cost Profile
Developer: z-ai · Tokenizer: Other · Quantization: bf16
Canonical ID: z-ai/glm-4.7-flash-20260119
Pricing updated Apr 24, 2026

Live Pricing
Input: $0.0600 / 1M
Output: $0.4000 / 1M
Last synced Apr 24, 2026
Z.ai: GLM 4.7 Flash is designed for applications that require extensive context management, accommodating up to 202,752 input tokens, which makes it suitable for complex document analysis and large-scale conversational AI. Teams using this model can expect an input cost of $0.06 per million tokens and an output cost of $0.40 per million tokens, allowing budget-conscious planning in high-volume scenarios. Its capabilities suit industries such as finance, legal, and customer support, where detailed context and nuanced understanding are critical.
| Spec | Value |
|---|---|
| Context Window | 202,752 input tokens (full-context input ≈ $0.01) |
| Max Output | Not specified |
| Input Price / 1M | $0.0600 (prompt tokens) |
| Output Price / 1M | $0.4000 (completion tokens) |
| Top Benchmark | Pending (no benchmark data yet) |
Price History

| Metric | Current |
|---|---|
| Input / 1M | $0.0600 |
| Output / 1M | $0.4000 |

Performance History

| Metric | Current |
|---|---|
| TPS | 0.00 |
| Latency | 0ms |
| Uptime | 99.1% |
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.0600 |
| Output (Completion) | $0.4000 |
| Cache Read | $0.0100 |
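
The per-token rates in the table above can be turned into a simple request-level cost estimate. A minimal sketch (the helper name and token counts are illustrative, not part of any official SDK):

```python
# Published per-million-token rates for Z.ai: GLM 4.7 Flash (from the table above).
INPUT_PER_M = 0.06       # $ per 1M prompt tokens
OUTPUT_PER_M = 0.40      # $ per 1M completion tokens
CACHE_READ_PER_M = 0.01  # $ per 1M cached prompt tokens

def request_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the dollar cost of one request at the rates above.

    Cached prompt tokens are billed at the cache-read rate instead of
    the full input rate.
    """
    return (
        (input_tokens - cached_tokens) * INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
    ) / 1_000_000

# A full-context prompt (202,752 tokens) with no output costs about $0.01:
print(round(request_cost(202_752, 0), 4))  # -> 0.0122
```

This also shows where the "full-context input ≈ $0.01" figure comes from: 202,752 × $0.06 / 1M ≈ $0.0122.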
Estimate monthly spend for Z.ai: GLM 4.7 Flash based on your workload. Example: a workload of 25M input + 12M output tokens comes to an estimated monthly cost of $6.30.
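
The $6.30 figure follows directly from the listed rates; a quick arithmetic check:

```python
# Example workload: 25M input + 12M output tokens per month,
# at $0.06 and $0.40 per million tokens respectively.
input_cost = 25 * 0.06   # 25M prompt tokens  -> $1.50
output_cost = 12 * 0.40  # 12M completion tokens -> $4.80
monthly = input_cost + output_cost
print(f"${monthly:.2f}")  # -> $6.30
```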