Model Cost Profile
Developer: nousresearch · Tokenizer: Llama3 · Instruct: chatml · Quantization: fp8
Canonical ID: nousresearch/hermes-3-llama-3.1-405b
Pricing updated Apr 23, 2026
Live Pricing
Input: $1.00
Output: $1.00
Last synced Apr 23, 2026 · MMLU score via public benchmark data
Nous: Hermes 3 405B Instruct, developed by nousresearch, has a context window of 131,072 tokens, which suits workloads that pass long inputs in a single request, such as legal document analysis or long-form content generation. Input and output are both priced at $1.00 per million tokens, so cost scales linearly with total token count and a prompt that fills the entire context window costs about $0.13. That flat rate makes it straightforward to budget for high-volume API usage or for large datasets and complex queries.
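The full-context input figure of about $0.13 shown on this page follows directly from the per-million price. Below is a minimal sketch of that arithmetic, assuming the listed rates apply uniformly to every token; `request_cost` is a hypothetical helper written for illustration, not part of any provider SDK.

```python
# Prices in USD per one million tokens, copied from the pricing shown on this page.
INPUT_PRICE_PER_M = 1.00
OUTPUT_PRICE_PER_M = 1.00
CONTEXT_WINDOW = 131_072  # tokens


def request_cost(prompt_tokens: int, completion_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    total = (prompt_tokens * INPUT_PRICE_PER_M
             + completion_tokens * OUTPUT_PRICE_PER_M)
    return total / 1_000_000


# A prompt that fills the whole context window, with no completion tokens.
full_context = request_cost(CONTEXT_WINDOW)
print(f"${full_context:.2f}")  # prints "$0.13"
```

Because input and output share the same rate here, only the total token count matters; the two prices are kept as separate constants so the sketch still works if they diverge.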
Context Window
131,072
Input tokens
Full-context input ≈ $0.13
Max Output
16,384
Completion tokens
Input Price / 1M
$1.00
Prompt tokens
Output Price / 1M
$1.00
Completion tokens
Top Benchmark
82.9
MMLU score — highest of MMLU, GPQA, MATH, HumanEval
Evaluation scores for Nous: Hermes 3 405B Instruct. The “Top Benchmark” shown above is the highest score across MMLU, GPQA, MATH & HumanEval.
Price History
Current Input / 1M
$1.00
Current Output / 1M
$1.00
Performance History
Current TPS
0.00
Current Latency
0ms
Uptime
100.0%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $1.00 |
| Output (Completion) | $1.00 |
Estimate monthly spend for Nous: Hermes 3 405B Instruct based on your workload.
Estimated Monthly Cost
$37
25M input + 12M output tokens
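The $37 estimate can be reproduced by hand: 25M input tokens at $1.00 per million plus 12M output tokens at $1.00 per million. The sketch below is one way to recompute it for other workloads, assuming the prices listed on this page; `monthly_cost` and `PRICE_PER_M` are hypothetical names chosen for this example.

```python
from decimal import Decimal

# USD per one million tokens, copied from the pricing table on this page.
PRICE_PER_M = {"input": Decimal("1.00"), "output": Decimal("1.00")}


def monthly_cost(input_millions, output_millions) -> Decimal:
    """Estimate monthly spend from token volumes given in millions (hypothetical helper)."""
    return (Decimal(str(input_millions)) * PRICE_PER_M["input"]
            + Decimal(str(output_millions)) * PRICE_PER_M["output"])


# The workload shown in the estimate: 25M input + 12M output tokens.
estimate = monthly_cost(25, 12)
print(f"${estimate:.0f}")  # prints "$37"
```

`Decimal` is used instead of floats so that fractional volumes such as 2.5M tokens do not pick up binary rounding noise in the reported dollar amount.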