Context Window
202,752 input tokens (full-context input ≈ $0.01)

Model Cost Profile
Developer: z-ai · Tokenizer: Other · Quantization: bf16
Canonical ID: z-ai/glm-4.7-flash-20260119
Pricing updated Apr 24, 2026

Live Pricing
Input: $0.0600 / 1M
Output: $0.4000 / 1M
Last synced Apr 24, 2026
Z.ai: GLM 4.7 Flash is designed for applications that require extensive context management, accommodating up to 202,752 input tokens, which makes it suitable for complex document analysis and large-scale conversational AI. Teams using this model can expect an input cost of $0.06 per million tokens and an output cost of $0.40 per million tokens, allowing budget-conscious planning in high-volume scenarios. Its capabilities suit industries such as finance, legal, and customer support, where detailed context and nuanced understanding are critical.
| Spec | Value |
|---|---|
| Context Window | 202,752 input tokens (full-context input ≈ $0.01) |
| Max Output | Not specified |
| Input Price / 1M | $0.0600 (prompt tokens) |
| Output Price / 1M | $0.4000 (completion tokens) |
| Top Benchmark | Pending (no benchmark data yet) |
Price History

| Metric | Current |
|---|---|
| Input / 1M | $0.0600 |
| Output / 1M | $0.4000 |

Performance History

| Metric | Current |
|---|---|
| TPS | 0.00 |
| Latency | 0ms |
| Uptime | 99.1% |
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.0600 |
| Output (Completion) | $0.4000 |
| Cache Read | $0.0100 |
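
The per-token rates in the table above can be turned into a simple request-level cost estimate. A minimal sketch (the helper name and token counts are illustrative, not part of any official SDK):

```python
# Published per-million-token rates for Z.ai: GLM 4.7 Flash (from the table above).
INPUT_PER_M = 0.06       # $ per 1M prompt tokens
OUTPUT_PER_M = 0.40      # $ per 1M completion tokens
CACHE_READ_PER_M = 0.01  # $ per 1M cached prompt tokens

def request_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the dollar cost of one request at the rates above.

    Cached prompt tokens are billed at the cache-read rate instead of
    the full input rate.
    """
    return (
        (input_tokens - cached_tokens) * INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
    ) / 1_000_000

# A full-context prompt (202,752 tokens) with no output costs about $0.01:
print(round(request_cost(202_752, 0), 4))  # -> 0.0122
```

This also shows where the "full-context input ≈ $0.01" figure comes from: 202,752 × $0.06 / 1M ≈ $0.0122.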
Estimate monthly spend for Z.ai: GLM 4.7 Flash based on your workload. Example: a workload of 25M input + 12M output tokens comes to an estimated monthly cost of $6.30.
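
The $6.30 figure follows directly from the listed rates; a quick arithmetic check:

```python
# Example workload: 25M input + 12M output tokens per month,
# at $0.06 and $0.40 per million tokens respectively.
input_cost = 25 * 0.06   # 25M prompt tokens  -> $1.50
output_cost = 12 * 0.40  # 12M completion tokens -> $4.80
monthly = input_cost + output_cost
print(f"${monthly:.2f}")  # -> $6.30
```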