Hermes 4 405B
NousResearch · released 2025-08-26 · llama3 license
Hermes 4 405B is Nous Research's hybrid-reasoning fine-tune of Llama-3.1-405B, released in August 2025. In reasoning mode, its technical report lists MATH-500 96.2, AIME25 78.1, GPQA Diamond 70.6 and LiveCodeBench 61.4, alongside class-leading low refusal rates.
Key specs
| Spec | Value |
|---|---|
| Type | Local open-weight |
| Parameters | 405.85B total |
| Architecture | llama |
| Context window | 131K tokens |
| Knowledge cutoff | 2024-08-31 |
| Modalities | text |
| Recommended backends | — |
| Minimum viable rig | — |
Benchmark scores
| Benchmark | Score |
|---|---|
| GPQA Diamond | 70.6% |
| SWE-bench Verified | — |
| AIME25 | 78.1% |
| MMLU-Pro | 80.6% |
| BFCL v3 (tool use) | — |
| Composite score | 7.66 |
| Community rating | No reviews yet |
VRAM & disk per quantization
| Quant | VRAM | Disk | RAM | Context |
|---|---|---|---|---|
| Q4_K_M | 236.9 GB | 235.4 GB | — | 131K |
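The table's figures can be cross-checked with a little arithmetic. The sketch below derives the bits per weight implied by the Q4_K_M disk size and the parameter count, the VRAM listed beyond the weight file, and a rough device count. It assumes decimal gigabytes (1 GB = 10^9 bytes), and the 80 GB per-device capacity is a hypothetical value chosen for illustration, not a recommendation from this page. The device count is a lower bound that ignores runtime overhead and how layers split across devices.

```python
import math

# Figures copied from the tables on this page.
PARAMS = 405.85e9   # total parameters
DISK_GB = 235.4     # Q4_K_M size on disk
VRAM_GB = 236.9     # Q4_K_M VRAM figure at 131K context

BYTES_PER_GB = 1e9  # assumption: decimal gigabytes


def implied_bits_per_weight(disk_gb: float, params: float) -> float:
    """Average bits stored per parameter, derived from file size."""
    return disk_gb * BYTES_PER_GB * 8 / params


def min_devices(vram_gb: float, per_device_gb: float) -> int:
    """Lower bound on device count if memory pooled perfectly."""
    return math.ceil(vram_gb / per_device_gb)


bpw = implied_bits_per_weight(DISK_GB, PARAMS)
overhead_gb = VRAM_GB - DISK_GB

print(f"implied bits per weight: {bpw:.2f}")
print(f"VRAM listed beyond weights: {overhead_gb:.1f} GB")
# 80 GB per device is a hypothetical capacity used only for illustration.
print(f"80 GB devices needed (lower bound): {min_devices(VRAM_GB, 80)}")
```

The same two functions can be reused for other quantizations by substituting their disk and VRAM figures.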
Strengths & weaknesses
Strengths: frontier-level math (MATH-500 96.2, AIME25 78.1) in reasoning mode; very low refusal rate and high steerability; hybrid reasoning with an explicit think mode
Weaknesses: built on Llama-3.1-405B; scores drop sharply in non-reasoning mode; large and expensive to self-host; restrictive Llama license