Model Cost Profile
Developer: nousresearch · Tokenizer: Llama3 · Quantization: fp8
Canonical ID: nousresearch/hermes-4-70b
Pricing updated Apr 22, 2026
Live Pricing
Input: $0.1300
Output: $0.4000
Last synced Apr 22, 2026
Nous: Hermes 4 70B, developed by nousresearch, has a context window of 131,072 tokens, which suits long-document analysis and extended multi-turn conversations. The API is priced at $0.13 per million input tokens and $0.40 per million output tokens. Output costs roughly three times as much as input, so response length matters as much as prompt size when budgeting for high-volume workloads. The long context makes the model a candidate for legal, healthcare, and customer-support applications, where a single request may need to carry a large amount of supporting material.
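As a rough illustration of how the listed per-million prices translate into a per-request cost, the sketch below recomputes the "Full-context input" figure shown on this page. The helper name `request_cost` and the constant names are hypothetical, chosen for this example only. The prices and context size are the ones listed here and may change.

```python
# Prices as listed on this page, in USD per 1M tokens (subject to change).
INPUT_PRICE_PER_M = 0.13
OUTPUT_PRICE_PER_M = 0.40
CONTEXT_WINDOW = 131_072  # tokens


def request_cost(input_tokens: int, output_tokens: int = 0) -> float:
    """Return the USD cost of a single request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# Cost of a prompt that fills the whole context window, with no output.
full_context = request_cost(CONTEXT_WINDOW)
print(f"Full-context input: ${full_context:.4f}")  # prints $0.0170
```

Rounded to cents, this matches the ≈ $0.02 full-context input figure listed on the page.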
Context Window
131,072
Input tokens
Full-context input ≈ $0.02
Max Output
Not specified
Input Price / 1M
$0.1300
Prompt tokens
Output Price / 1M
$0.4000
Completion tokens
Top Benchmark
Pending
No benchmark data yet
Price History
Current Input / 1M
$0.1300
Current Output / 1M
$0.4000
Performance History
Current TPS
0.00
Current Latency
0ms
Uptime
100.0%
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $0.1300 |
| Output (Completion) | $0.4000 |
Estimate monthly spend for Nous: Hermes 4 70B based on your workload.
Estimated Monthly Cost
$8.05
25M input + 12M output tokens
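The estimate above can be reproduced by hand: 25 × $0.13 + 12 × $0.40 = $3.25 + $4.80 = $8.05. A minimal sketch of the same arithmetic follows, assuming the listed prices stay fixed for the whole month. The function name `monthly_cost` is hypothetical, and `Decimal` is used only to keep the cent rounding exact.

```python
from decimal import Decimal, ROUND_HALF_UP

# Prices as listed on this page, in USD per 1M tokens (subject to change).
INPUT_PRICE_PER_M = Decimal("0.13")
OUTPUT_PRICE_PER_M = Decimal("0.40")


def monthly_cost(input_millions, output_millions) -> Decimal:
    """Return estimated monthly spend in USD, rounded to cents.

    Token volumes are given in millions of tokens per month.
    """
    total = (Decimal(str(input_millions)) * INPUT_PRICE_PER_M
             + Decimal(str(output_millions)) * OUTPUT_PRICE_PER_M)
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Workload from the calculator: 25M input + 12M output tokens per month.
estimate = monthly_cost(25, 12)
print(f"Estimated monthly cost: ${estimate}")  # prints $8.05
```

In this workload the 12M output tokens account for $4.80 of the $8.05, more than the 25M input tokens, so trimming response length reduces spend faster than trimming prompts.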