Hermes 4 70B
NousResearch · released 2025-08-26 · llama3 license
Hermes 4 70B is Nous Research's hybrid-reasoning fine-tune of Llama-3.1-70B, released in August 2025. In reasoning mode it scores 95.5 on MATH-500, 67.5 on AIME25, 66.1 on GPQA, and 50.5 on LiveCodeBench. Artificial Analysis gives its non-reasoning mode an index score of 13.
Key specs
| Spec | Value |
|---|---|
| Type | Local open-weight |
| Parameters | 70.55B total |
| Architecture | llama |
| Context window | 131K tokens |
| Knowledge cutoff | 2024-08-31 |
| Modalities | text |
| Recommended backends | — |
| Minimum viable rig | — |
Benchmark scores
| Metric | Score |
|---|---|
| GPQA Diamond | 66.1% |
| SWE-bench Verified | — |
| AIME 2025 | 67.5% |
| MMLU-Pro | 80.7% |
| BFCL v3 (tool use) | — |
| Composite score | 7.2 |
| Community rating | No reviews yet |
VRAM & disk per quantization
| Quant | VRAM | Disk | RAM | Context |
|---|---|---|---|---|
| Q4_K_M | 42.4 GB | 40.9 GB | — | 131K |
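
The Q4_K_M row can be sanity-checked against the parameter count with a little arithmetic. The sketch below is illustrative only and rests on assumptions this page does not state: that "GB" means 10^9 bytes, that "131K" means 131,072 (128 × 1,024) tokens, and that the model keeps the attention layout of its Llama-3.1-70B base (80 layers, 8 key/value heads, head dimension 128) with a 16-bit KV cache. The function names are hypothetical helpers, not part of any library.

```python
# Figures copied from this page.
PARAMS = 70.55e9   # total parameters
DISK_GB = 40.9     # Q4_K_M file size
VRAM_GB = 42.4     # Q4_K_M VRAM figure

# Assumptions (not stated on this page).
BYTES_PER_GB = 1e9        # decimal gigabytes
CONTEXT_TOKENS = 131_072  # "131K" read as 128 * 1024
N_LAYERS = 80             # Llama-3.1-70B base config
N_KV_HEADS = 8            # grouped-query attention
HEAD_DIM = 128
KV_BYTES = 2              # 16-bit cache entries


def bits_per_weight(disk_gb: float, params: float) -> float:
    """Average storage cost per parameter implied by the file size."""
    return disk_gb * BYTES_PER_GB * 8 / params


def kv_cache_gb(tokens: int) -> float:
    """Size of an unquantized KV cache holding `tokens` positions."""
    # Factor 2 covers the separate key and value tensors in each layer.
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * KV_BYTES
    return tokens * per_token / BYTES_PER_GB


bpw = bits_per_weight(DISK_GB, PARAMS)
headroom_gb = VRAM_GB - DISK_GB
full_cache_gb = kv_cache_gb(CONTEXT_TOKENS)

print(f"{bpw:.2f} bits per weight")
print(f"{headroom_gb:.1f} GB between the VRAM and disk figures")
print(f"{full_cache_gb:.1f} GB KV cache at {CONTEXT_TOKENS} tokens")
```

Under these assumptions the file works out to roughly 4.6 bits per weight, and the gap between the VRAM and disk figures is far smaller than a full-length 16-bit KV cache. If that holds, the 42.4 GB figure is best read as weights plus a small working buffer rather than a budget for the whole context window.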
Strengths & weaknesses
Strengths: Strong reasoning-mode math (MATH-500 95.5) for a 70B dense model; High steerability / low refusals; Faster and cheaper than average for its size
Weaknesses: Below-average overall intelligence vs same-size peers (AA index 13 in non-reasoning mode); Restrictive Llama license; 128K context (the 131K-token window listed under Key specs)