Model Cost Profile

Google: Gemma 3 12B

Developer: google

Pricing updated Mar 11, 2026

Input rank: #45 · Output rank: #48

Live Pricing

Input: $0.0400 / 1M tokens

Output: $0.1300 / 1M tokens

Pricing via OpenRouter API · Last synced Mar 11, 2026

Google's Gemma 3 12B model offers a substantial context window of 131,072 tokens, making it ideal for applications requiring extensive text comprehension, such as legal document analysis and long-form content generation. With an input price of $0.04 per million tokens and an output price of $0.13 per million tokens, teams can effectively manage costs while leveraging the model for high-volume tasks. This pricing structure allows organizations to scale their usage based on specific project needs, ensuring budget-friendly access to advanced AI capabilities.
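The per-token rates above translate directly into spend projections. A minimal sketch, using the prices listed on this page; the function names are illustrative, not part of any SDK:

```python
# Estimate per-request and monthly cost for Gemma 3 12B at the listed
# OpenRouter rates: $0.04 per 1M input tokens, $0.13 per 1M output tokens.

INPUT_PRICE_PER_M = 0.04    # USD per 1M prompt tokens
OUTPUT_PRICE_PER_M = 0.13   # USD per 1M completion tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

def monthly_cost(requests_per_day: int, avg_in: int, avg_out: int,
                 days: int = 30) -> float:
    """Project monthly spend for a steady workload."""
    return requests_per_day * days * request_cost(avg_in, avg_out)

# Example workload: 10,000 requests/day, 2,000 prompt + 500 completion tokens each.
print(f"${request_cost(2000, 500):.6f} per request")   # $0.000145
print(f"${monthly_cost(10_000, 2000, 500):,.2f} per month")  # $43.50
```

At these rates even a sizable workload stays in the tens of dollars per month, which is what makes the model attractive for high-volume tasks.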

๐Ÿ‘ Vision๐Ÿ“‹ Structured Output

Context Window

131,072

Tokens

Input Price / 1M

$0.0400

Prompt tokens

Output Price / 1M

$0.1300

Completion tokens

Intelligence (MMLU)

Benchmark Pending

Massive Multitask Language Understanding

Price History

Google: Gemma 3 12B Pricing Trend

Input / 1M tokens: 0.0% change · Output / 1M tokens: 0.0% change (Mar 7 – Mar 11; prices held steady at $0.0400 input and $0.1300 output)

Current Input / 1M

$0.0400

Current Output / 1M

$0.1300

Cheaper Alternatives to Compare

Quick links for cost-reduction decisions before production rollout.

FAQ

Common pricing and benchmark questions for Google: Gemma 3 12B.

How much does Google: Gemma 3 12B cost per 1M input tokens?

Google: Gemma 3 12B input pricing is $0.0400 per 1M tokens based on the latest synced provider data.

How much does Google: Gemma 3 12B cost per 1M output tokens?

Google: Gemma 3 12B output pricing is $0.1300 per 1M tokens based on the latest synced provider data.

What context window does Google: Gemma 3 12B support?

Google: Gemma 3 12B supports a context window of 131,072 tokens.
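A quick way to sanity-check whether a workload fits that window is to budget prompt and completion tokens together. A rough sketch; the 4-characters-per-token heuristic is an assumption, not the model's actual tokenizer:

```python
# Check whether a prompt plus a reserved completion budget fits in
# Gemma 3 12B's 131,072-token context window.

CONTEXT_WINDOW = 131_072

def fits_in_context(prompt: str, max_completion_tokens: int) -> bool:
    # Crude estimate: ~4 characters per token; use a real tokenizer in practice.
    est_prompt_tokens = len(prompt) // 4
    return est_prompt_tokens + max_completion_tokens <= CONTEXT_WINDOW

print(fits_in_context("word " * 1000, 4096))   # True: ~1,250 tokens + 4,096 fits
```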

How can I compare Google: Gemma 3 12B with cheaper alternatives?

Use the comparison links on this page to open direct model-vs-model pricing and benchmark pages, then evaluate monthly spend projections for your workload.
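A spend comparison like the one described above can be sketched as follows. The alternative model's rates here are placeholders; substitute real prices from the comparison pages:

```python
# Compare projected monthly spend between Gemma 3 12B and a candidate
# alternative for the same workload.

def monthly_spend(in_price: float, out_price: float, req_per_day: int,
                  avg_in: int, avg_out: int, days: int = 30) -> float:
    """USD per month at the given per-1M-token prices."""
    per_req = (avg_in * in_price + avg_out * out_price) / 1_000_000
    return per_req * req_per_day * days

# 5,000 requests/day, 1,500 prompt + 400 completion tokens each.
gemma = monthly_spend(0.04, 0.13, 5_000, 1_500, 400)
alt = monthly_spend(0.02, 0.06, 5_000, 1_500, 400)  # hypothetical cheaper rates
print(f"Gemma 3 12B: ${gemma:,.2f}/mo · alternative: ${alt:,.2f}/mo · "
      f"savings: {100 * (1 - alt / gemma):.1f}%")
```

Running the projection for both models on the same request volume and token mix makes the trade-off concrete before committing to a rollout.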