Context Window
131,072
Input tokens
Full-context input ≈ $0.04
Model Cost Profile
Developer: nousresearch· Tokenizer: Llama3 · Instruct: chatml · Quantization: fp8
Canonical ID: nousresearch/hermes-3-llama-3.1-70b
Pricing updated Apr 24, 2026
Live Pricing
Input: $0.3000 / 1M tokens
Output: $0.3000 / 1M tokens
Last synced Apr 24, 2026 · MMLU score via public benchmark data
Nous: Hermes 3 70B Instruct, developed by nousresearch, has a context window of 131,072 tokens, which suits long-form content generation, detailed data analysis, and multi-turn conversational AI. Input and output are both priced at $0.30 per 1 million tokens, so cost scales linearly with total token volume regardless of how a workload splits between prompt and completion. That makes the model a practical choice for teams that process large documents at volume or need nuanced responses in customer service and other interactive applications.
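The "Full-context input" figure follows directly from the listed context window and input price. A minimal sketch of that arithmetic, assuming the price applies uniformly per token; the function and constant names are illustrative and not part of any provider API:

```python
# Values copied from this page; names are hypothetical, chosen for illustration.
CONTEXT_WINDOW_TOKENS = 131_072
INPUT_PRICE_PER_MILLION_USD = 0.30


def input_cost_usd(tokens: int,
                   price_per_million: float = INPUT_PRICE_PER_MILLION_USD) -> float:
    """Return the USD cost of sending `tokens` prompt tokens."""
    return tokens * price_per_million / 1_000_000


# Cost of a single request that fills the entire context window.
full_context_cost = input_cost_usd(CONTEXT_WINDOW_TOKENS)
print(f"Full-context input: ${full_context_cost:.4f}")  # prints "Full-context input: $0.0393"
```

Rounded to two decimal places, this gives the $0.04 shown on the page.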
Max Output
—
Not specified
Input Price / 1M
$0.3000
Prompt tokens
Output Price / 1M
$0.3000
Completion tokens
Top Benchmark
66.4
MMLU score — highest of MMLU, GPQA, MATH, HumanEval
Evaluation scores for Nous: Hermes 3 70B Instruct. The "Top Benchmark" shown above is the highest score across MMLU, GPQA, MATH, and HumanEval.
Price History
Current Input / 1M
$0.3000
Current Output / 1M
$0.3000
Performance History
Current TPS
0.00
Current Latency
0ms
Uptime
100.0%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.3000 |
| Output (Completion) | $0.3000 |
Estimate monthly spend for Nous: Hermes 3 70B Instruct based on your workload.
Estimated Monthly Cost
$11
25M input + 12M output tokens
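The estimate above can be reproduced from the per-million prices in the table. The following is a rough sketch, assuming flat per-token pricing with no volume discounts, caching, or minimum charges; the `Pricing` class and `monthly_cost_usd` function are hypothetical helpers, not part of any SDK:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-million-token prices in USD (illustrative container)."""
    input_per_million_usd: float
    output_per_million_usd: float


def monthly_cost_usd(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Estimate monthly spend from total prompt and completion tokens."""
    input_cost = input_tokens * pricing.input_per_million_usd / 1_000_000
    output_cost = output_tokens * pricing.output_per_million_usd / 1_000_000
    return input_cost + output_cost


# Prices listed on this page for Nous: Hermes 3 70B Instruct.
HERMES_3_70B = Pricing(input_per_million_usd=0.30, output_per_million_usd=0.30)

# Workload from the estimate: 25M input + 12M output tokens per month.
estimate = monthly_cost_usd(HERMES_3_70B, 25_000_000, 12_000_000)
print(f"Estimated monthly cost: ${estimate:.2f}")  # prints "Estimated monthly cost: $11.10"
```

The input share is $7.50 and the output share is $3.60, which the page rounds to $11.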