Mistral Nemo Instruct 2407

mistralai · released 2024-07-19 · apache-2.0 license

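Mistral-Nemo-Instruct-2407 (July 2024, Apache-2.0) is a 12B-parameter instruction-tuned model with a 128K-token context window (131,072 tokens, listed below as 131K), built by Mistral AI in collaboration with NVIDIA. It scores 44.81% on an independent MMLU-Pro run; the vendor-reported multilingual MMLU average is about 61.3.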

Key specs

Type	Local open-weight
Parameters	12.25B total
Architecture	mistral
Context window	131K tokens
Knowledge cutoff	2024-04-30
Modalities	text
Recommended backends	—
Minimum viable rig	—

Benchmark scores

GPQA Diamond	—
SWE-bench Verified	—
AIME	—
MMLU-Pro	44.81%
BFCL v3 (tool use)	—
Composite score	5.03
Community rating	No reviews yet

VRAM & disk per quantization

Quant	VRAM	Disk	RAM	Context
Q4_K_M	8.6 GB	7.1 GB	—	131K
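
The Q4_K_M row can be sanity-checked with simple arithmetic. The sketch below uses only the figures from this page (12.25B parameters, 7.1 GB on disk, 8.6 GB VRAM) and derives the average bits per weight they imply, plus the headroom left for runtime buffers. It assumes GB means 10^9 bytes and that the file on disk is almost entirely weights; the helper names are hypothetical and not part of any library.

```python
def estimate_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of quantized weights in GB (assumed 10^9 bytes)."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


def implied_bits_per_weight(size_gb: float, params_billion: float) -> float:
    """Average bits per weight implied by a file size and a parameter count."""
    return size_gb * 1e9 * 8 / (params_billion * 1e9)


# Figures taken from the tables on this page.
PARAMS_B = 12.25   # total parameters, in billions
DISK_GB = 7.1      # Q4_K_M size on disk
VRAM_GB = 8.6      # Q4_K_M listed VRAM requirement

# Assumption: the disk figure is dominated by the quantized weights.
bpw = implied_bits_per_weight(DISK_GB, PARAMS_B)

# Whatever VRAM is not weights is left for KV cache, activations, and buffers.
overhead_gb = VRAM_GB - DISK_GB

print(f"implied bits per weight: {bpw:.2f}")
print(f"non-weight VRAM headroom: {overhead_gb:.1f} GB")
```

The same helpers can be pointed at other quantization levels by passing a different bits-per-weight value to `estimate_weight_gb`. The table does not state the context length at which the VRAM figure was measured, so the headroom number should be read as a rough budget rather than a guarantee at the full 131K window.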

Strengths & weaknesses

Strengths: Large 128K context window; strong multilingual performance; drop-in replacement for Mistral 7B; Apache-2.0 license; developed with NVIDIA

Weaknesses: No built-in moderation or guardrails; MMLU-Pro score (44.8%) below newer small models