Mistral Nemo Instruct 2407
mistralai · released 2024-07-19 · apache-2.0 license
Mistral-Nemo-Instruct-2407 (July 2024, Apache-2.0) is a 12B-parameter model with a 128K-token context window, trained with NVIDIA. It scores 44.81% on MMLU-Pro in independent testing; the vendor-reported multilingual MMLU average is about 61.3.
Key specs
| Spec | Value |
|---|---|
| Type | Local open-weight |
| Parameters | 12.25B total |
| Architecture | mistral |
| Context window | 131K tokens (128K, i.e. 131,072) |
| Knowledge cutoff | 2024-04-30 |
| Modalities | text |
| Recommended backends | — |
| Minimum viable rig | — |
Benchmark scores
| Metric | Value |
|---|---|
| GPQA Diamond | — |
| SWE-bench Verified | — |
| AIME | — |
| MMLU-Pro | 44.81% |
| BFCL v3 (tool use) | — |
| Composite score | 5.03 |
| Community rating | No reviews yet |
VRAM & disk per quantization
| Quant | VRAM | Disk | RAM | Context |
|---|---|---|---|---|
| Q4_K_M | 8.6 GB | 7.1 GB | — | 131K |
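
As a rough sanity check on the Q4_K_M row, the sketch below derives the average bits stored per parameter from the listed disk size and parameter count, and the gap between the VRAM and disk figures. It assumes "GB" means 10^9 bytes and that the disk figure is almost entirely weights; the source does not say what the extra VRAM covers (KV cache, buffers, or something else), so treat the result as an estimate only. The function names are illustrative, not part of any tool.

```python
# Figures copied from the tables above.
# Assumption: "GB" means 10**9 bytes; the source does not specify.
PARAMS = 12.25e9   # total parameters
DISK_GB = 7.1      # Q4_K_M size on disk
VRAM_GB = 8.6      # Q4_K_M VRAM figure


def bits_per_weight(disk_gb: float, params: float) -> float:
    """Average stored bits per parameter implied by a file size."""
    return disk_gb * 1e9 * 8 / params


def overhead_gb(vram_gb: float, disk_gb: float) -> float:
    """VRAM listed beyond the size of the weights file."""
    return vram_gb - disk_gb


bpw = bits_per_weight(DISK_GB, PARAMS)
extra = overhead_gb(VRAM_GB, DISK_GB)
print(f"{bpw:.2f} bits/weight, {extra:.1f} GB beyond the weights")
```

With the table's numbers this works out to roughly 4.6 bits per weight and about 1.5 GB of VRAM on top of the weights file.
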
Strengths & weaknesses
Strengths: Large 128K context window; Strong multilingual performance; Drop-in replacement for Mistral 7B; Apache-2.0 license (model trained with NVIDIA)
Weaknesses: No built-in moderation or guardrails; MMLU-Pro score (44.81%) trails newer small models