Model Cost Profile

Z.ai: GLM 4.6

Developer: z-ai

Pricing updated Mar 11, 2026

Input rank: #185 · Output rank: #211

Live Pricing

Input: $0.3900

Output: $1.90

Pricing via OpenRouter API · Last synced Mar 11, 2026

Z.ai: GLM 4.6, developed by z-ai, offers a substantial context window of 204,800 tokens, making it well suited to applications that require extensive text comprehension, such as legal document analysis and long-form content generation. At the latest synced rates, teams using this API model pay $0.39 per million input tokens and $1.90 per million output tokens, figures that matter when budgeting projects with high token usage. This pricing structure lets organizations scale usage while keeping the cost of large-scale natural language processing tasks predictable.
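The per-million-token rates above translate directly into per-call and monthly spend. The sketch below uses the synced rates from this page; the workload numbers (4,000 input / 1,000 output tokens per call, 50,000 calls per month) are illustrative assumptions, not measured traffic.

```python
# Estimate per-request and monthly cost for Z.ai: GLM 4.6 at the
# synced OpenRouter rates on this page.
INPUT_PRICE_PER_M = 0.39    # USD per 1M input (prompt) tokens
OUTPUT_PRICE_PER_M = 1.90   # USD per 1M output (completion) tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single API call."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Hypothetical workload: 4,000 input / 1,000 output tokens per call,
# 50,000 calls per month.
per_call = request_cost(4_000, 1_000)
monthly = per_call * 50_000
```

At these assumed volumes the output side dominates: 1,000 completion tokens cost more than 4,000 prompt tokens, so trimming verbose completions usually saves more than trimming prompts.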

🔧 Tool Calling · 📋 Structured Output · 🧠 Reasoning

Context Window

204,800

Tokens

Input Price / 1M

$0.3900

Prompt tokens

Output Price / 1M

$1.90

Completion tokens

Intelligence (MMLU)

Benchmark Pending

Massive Multitask Language Understanding

Price History

Z.ai: GLM 4.6 Pricing Trend

Input / 1M tokens: 0.0% change · Output / 1M tokens: 0.0% change

[Price trend chart, Mar 7 to Mar 11: input held at $0.3900 and output at $1.90 over the period.]

Current Input / 1M

$0.3900

Current Output / 1M

$1.90

Cheaper Alternatives to Compare

Quick links for cost-down decisions before production rollout.

FAQ

Common pricing and benchmark questions for Z.ai: GLM 4.6.

How much does Z.ai: GLM 4.6 cost per 1M input tokens?

Z.ai: GLM 4.6 input pricing is $0.3900 per 1M tokens based on the latest synced provider data.

How much does Z.ai: GLM 4.6 cost per 1M output tokens?

Z.ai: GLM 4.6 output pricing is $1.90 per 1M tokens based on the latest synced provider data.

What context window does Z.ai: GLM 4.6 support?

Z.ai: GLM 4.6 supports a context window of 204,800 tokens.
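Since input and output share the 204,800-token window, a practical pre-flight check is to confirm that prompt plus reserved completion budget fits. A minimal sketch, assuming token counts come from your own tokenizer (the check itself is generic, not part of any Z.ai API):

```python
# Check whether a request fits GLM 4.6's 204,800-token context window,
# reserving room for the completion.
CONTEXT_WINDOW = 204_800

def fits_in_context(prompt_tokens: int, max_output_tokens: int) -> bool:
    """True if prompt plus reserved completion budget fits the window."""
    return prompt_tokens + max_output_tokens <= CONTEXT_WINDOW
```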

How can I compare Z.ai: GLM 4.6 with cheaper alternatives?

Use the comparison links on this page to open direct model-vs-model pricing and benchmark pages, then evaluate monthly spend projections for your workload.
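A spend projection like the one suggested above can be sketched in a few lines. The GLM 4.6 rates below are the synced values from this page; the alternative model's rates are hypothetical placeholders standing in for whatever candidate you pull from the comparison links.

```python
# Compare projected monthly spend for GLM 4.6 against a hypothetical
# cheaper alternative. Rates are USD per 1M tokens (input, output);
# "alt-model" is an illustrative placeholder, not a real quote.
MODELS = {
    "glm-4.6": (0.39, 1.90),   # synced rates from this page
    "alt-model": (0.20, 1.00), # hypothetical alternative
}

def monthly_spend(model: str, input_m: float, output_m: float) -> float:
    """USD per month, given millions of input/output tokens per month."""
    in_rate, out_rate = MODELS[model]
    return input_m * in_rate + output_m * out_rate

# Example workload: 200M input and 50M output tokens per month.
for name in MODELS:
    print(name, round(monthly_spend(name, 200, 50), 2))
```

Running both models through the same workload makes the trade-off concrete before committing to a production rollout.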