Context Window
4,096
Tokens
Model Cost Profile
Developer: EleutherAI
Pricing updated Mar 11, 2026
EleutherAI's Llemma 7b is a language model specialized for mathematics, with a context window of 4,096 tokens. Through this API, input is priced at $0.80 per 1 million tokens and output at $1.20 per 1 million tokens, so monthly spend scales directly with the number of prompt and completion tokens processed. Output tokens cost 1.5 times as much as input tokens, which makes completion length the larger cost lever for generation-heavy workloads.
Input Price / 1M
$0.80
Prompt tokens
Output Price / 1M
$1.20
Completion tokens
Intelligence (MMLU)
Benchmark Pending
Massive Multitask Language Understanding
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.80 |
| Output (Completion) | $1.20 |
Price History
Current Input / 1M
$0.80
Current Output / 1M
$1.20
Estimate monthly spend for EleutherAI: Llemma 7b based on your workload.
Estimated Monthly Cost
$34
25M input + 12M output tokens
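The estimate above follows directly from the per-token prices. A minimal sketch of the arithmetic, assuming the listed prices apply uniformly with no volume discounts or minimum charges (the function name `monthly_cost` is hypothetical, not part of any provider SDK):

```python
# Prices taken from this page, in USD per 1 million tokens.
INPUT_PRICE_PER_M = 0.80
OUTPUT_PRICE_PER_M = 1.20


def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated monthly spend in USD for the given token volumes."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# The example workload shown on this page: 25M input + 12M output tokens.
cost = monthly_cost(25_000_000, 12_000_000)
print(f"${cost:.2f}")  # 25 * 0.80 + 12 * 1.20 = 20.00 + 14.40
```

For this workload the unrounded figure is $34.40, which the page displays as $34.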
Common pricing and benchmark questions for EleutherAI: Llemma 7b.
EleutherAI: Llemma 7b input pricing is $0.80 per 1M tokens based on the latest synced provider data.
EleutherAI: Llemma 7b output pricing is $1.20 per 1M tokens based on the latest synced provider data.
EleutherAI: Llemma 7b supports a context window of 4,096 tokens.
Use the comparison links on this page to open direct model-vs-model pricing and benchmark pages, then evaluate monthly spend projections for your workload.
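The 4,096-token context window also bounds how large a single request can be. A rough sketch of that budget check, assuming the prompt and the completion share the same window (typical for models of this kind, but not confirmed on this page) and that token counts come from the model's own tokenizer; `max_completion_tokens` is a hypothetical helper name:

```python
CONTEXT_WINDOW = 4096  # tokens, as listed on this page


def max_completion_tokens(prompt_tokens: int,
                          context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many completion tokens remain after the prompt.

    Assumes prompt and completion tokens count against one shared window.
    Returns 0 when the prompt alone fills or exceeds the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


# A 3,000-token prompt leaves room for at most 1,096 completion tokens.
remaining = max_completion_tokens(3_000)
print(remaining)
```

Capping completion length this way also caps the output side of the bill, since completion tokens are the more expensive of the two.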