Context Window
262,144
Input tokens
Full-context input ≈ $0.03
Model Cost Profile
Developer: Google · Tokenizer: Gemma
Pricing updated Apr 4, 2026
Live Pricing
Input: $0.1300
Output: $0.4000
Last synced Apr 4, 2026
Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Of its 25.2B total parameters, only 3.8B activate per token during inference, delivering near-31B quality at a fraction of the compute cost. It supports multimodal input including text, images, and video (up to 60s at 1fps). It features a 256K-token context window (262,144 tokens), native function calling, a configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0.
Provider Compliance
Max Output
262,144
Completion tokens
Input Price / 1M
$0.1300
Prompt tokens
Output Price / 1M
$0.4000
Completion tokens
Top Benchmark
Pending
No benchmark data yet
Price History
Not enough data yet. Price tracking started recently; check back in a few days.
Performance History
Not enough data yet. Performance tracking started recently; check back in a few days.
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.1300 |
| Output (Completion) | $0.4000 |
Estimate monthly spend for Google: Gemma 4 26B A4B based on your workload.
Estimated Monthly Cost
$8.05
25M input + 12M output tokens
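The estimate above can be reproduced with a short calculation. The following is a minimal sketch, assuming flat per-token pricing at the listed rates with no caching discounts, tiering, or minimum charges; the helper name `estimate_cost` is hypothetical and not part of any provider API.

```python
from decimal import Decimal

# Prices per 1M tokens in USD, copied from the pricing table on this page.
INPUT_PRICE_PER_M = Decimal("0.13")
OUTPUT_PRICE_PER_M = Decimal("0.40")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost for the given token counts (hypothetical helper).

    Assumes flat pricing: cost scales linearly with token count.
    """
    cost = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    ) / TOKENS_PER_M
    # Round to whole cents for display.
    return cost.quantize(Decimal("0.01"))


# Monthly workload from the estimate: 25M input + 12M output tokens.
monthly = estimate_cost(25_000_000, 12_000_000)

# A single request that fills the whole 262,144-token context window.
full_context = estimate_cost(262_144, 0)

print(monthly)       # 8.05
print(full_context)  # 0.03
```

Using `Decimal` rather than floats keeps the cent rounding exact. To model a different workload, change the token counts passed to `estimate_cost`.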
Quick links for cost-down decisions before production rollout.