Context Window
65,536
Input tokens
Full-context input ≈ $0.01
Model Cost Profile
Developer: rekaai · Tokenizer: Other
Pricing updated Apr 1, 2026
Live Pricing
Input: $0.1000
Output: $0.2000
Last synced Apr 1, 2026 · MMLU score via public benchmark data
Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parameters, developed by Reka. It is suited to general chat, coding, instruction following, and function calling. It has a 32K context length and was optimized with reinforcement learning (RLOO), and it delivers performance competitive with proprietary models at a smaller parameter count. The model is compact enough for low-latency, local, or on-device deployment, quantizes efficiently (down to 11 GB at 4-bit precision), and wraps its internal thought process in explicit reasoning tags ("<reasoning>"). Reka Flash 3 is primarily an English model with limited multilingual understanding. The model weights are released under the Apache 2.0 license.
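Because the model marks its internal thought process with explicit tags, client code usually wants to separate that reasoning from the final answer before showing a reply. The following is a minimal sketch, assuming the reasoning is wrapped in a matched `<reasoning>...</reasoning>` pair; the closing tag, the helper name `split_reasoning`, and the sample string are illustrative assumptions, not taken from this page.

```python
import re

# Assumption: reasoning appears inside a matched <reasoning>...</reasoning> pair.
# DOTALL lets the reasoning span multiple lines.
_REASONING_RE = re.compile(r"<reasoning>(.*?)</reasoning>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from a raw completion string.

    Hypothetical helper: if no reasoning tags are present, the reasoning
    part is an empty string and the answer is the stripped input.
    """
    blocks = [block.strip() for block in _REASONING_RE.findall(text)]
    answer = _REASONING_RE.sub("", text).strip()
    return "\n".join(blocks), answer


# Made-up sample completion, for illustration only.
raw = "<reasoning>2 + 2 is basic addition.</reasoning>The answer is 4."
reasoning, answer = split_reasoning(raw)
print(reasoning)  # 2 + 2 is basic addition.
print(answer)     # The answer is 4.
```

If a completion is cut off before the closing tag, this pattern will not match and the partial reasoning stays in the answer, so a production client would likely need to handle truncation separately.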
Max Output
65,536
Completion tokens
Input Price / 1M
$0.1000
Prompt tokens
Output Price / 1M
$0.2000
Completion tokens
Top Benchmark
66.9
MMLU score — highest of MMLU, GPQA, MATH, HumanEval
Evaluation scores for Reka: Flash 3. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.1000 |
| Output (Completion) | $0.2000 |
Estimate monthly spend for Reka: Flash 3 based on your workload.
Estimated Monthly Cost
$4.90
25M input + 12M output tokens
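The estimate above follows directly from the listed per-million-token prices. As a rough sketch, the arithmetic can be reproduced as shown here; the function name `cost_usd` is hypothetical, and the prices are the ones listed on this page as of Apr 1, 2026, so they may have changed.

```python
from decimal import Decimal

# Prices per 1M tokens, copied from the pricing table on this page.
INPUT_PRICE_PER_M = Decimal("0.10")
OUTPUT_PRICE_PER_M = Decimal("0.20")
TOKENS_PER_M = Decimal(1_000_000)


def cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the cost in USD for the given token counts (hypothetical helper)."""
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


# Monthly workload from the estimate: 25M input + 12M output tokens.
monthly = cost_usd(25_000_000, 12_000_000)
print(f"monthly: ${monthly:.2f}")  # monthly: $4.90

# One prompt that fills the listed 65,536-token context window.
full_context = cost_usd(65_536, 0)
print(f"full-context input: ${full_context:.4f}")  # full-context input: $0.0066
```

The full-context figure of about $0.0066 is what the page rounds to $0.01. `Decimal` is used here so the dollar amounts do not pick up binary floating-point error.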