DeepSeek R1 Distill Qwen 32B

deepseek-ai · released 2025-01-20 · MIT license

DeepSeek-R1-Distill-Qwen-32B (January 2025) distills DeepSeek-R1's reasoning into Qwen2.5-32B. Reported scores: AIME24 72.6 (pass@1), MATH-500 94.3, GPQA Diamond 62.1, LiveCodeBench 57.2, beating o1-mini on several benchmarks. Released under the MIT license.

Key specs

Type	Local open-weight
Parameters	32.76B total
Architecture	qwen2
Context window	131K tokens
Knowledge cutoff	—
Modalities	text
Recommended backends	—
Minimum viable rig	—

Benchmark scores

GPQA Diamond	62.1%
SWE-bench Verified	—
AIME 2024 (pass@1)	72.6%
MMLU-Pro	—
BFCL v3 (tool use)	—
Composite score	6.63
Community rating	No reviews yet

VRAM & disk per quantization

Quant	VRAM	Disk	RAM	Context
Q4_K_M	20.5 GB	19 GB	—	131K
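
The Q4_K_M row can be sanity-checked against the parameter count. The sketch below derives the average bits stored per weight implied by the disk size, and the gap between the VRAM and disk figures. It uses only numbers from the tables above. It assumes "GB" means 10^9 bytes, which the page does not state, and the function names are illustrative rather than part of any tool.

```python
# Figures copied from the spec tables above.
PARAMS = 32.76e9   # total parameters
DISK_GB = 19.0     # Q4_K_M file size on disk
VRAM_GB = 20.5     # Q4_K_M VRAM requirement

# Assumption: "GB" is decimal (10**9 bytes); the page does not say.
BYTES_PER_GB = 1e9


def effective_bits_per_weight(disk_gb: float, params: float) -> float:
    """Average number of bits stored per parameter implied by a file size."""
    return disk_gb * BYTES_PER_GB * 8 / params


def runtime_overhead_gb(vram_gb: float, disk_gb: float) -> float:
    """VRAM needed beyond the weights themselves (KV cache, buffers, etc.)."""
    return vram_gb - disk_gb


bpw = effective_bits_per_weight(DISK_GB, PARAMS)
overhead = runtime_overhead_gb(VRAM_GB, DISK_GB)
print(f"{bpw:.2f} bits/weight, {overhead:.1f} GB overhead")
```

Under these assumptions the listed file size works out to roughly 4.6 bits per weight, with about 1.5 GB of VRAM left for everything other than weights. The page does not say how that remainder is split between KV cache and buffers, or what context length the VRAM figure assumes.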

Strengths & weaknesses

Strengths: Strong math reasoning (AIME24 72.6 pass@1, MATH-500 94.3); Beats o1-mini on several benchmarks as a dense 32B model; MIT license

Weaknesses: Distilled reasoning, with weaker general chat breadth than the full R1; Long chain-of-thought (CoT) outputs increase latency and cost
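
To see how much of a completion goes to chain-of-thought, the reasoning trace can be separated from the final answer. This is a minimal sketch that assumes the model wraps its reasoning in <think>...</think> tags, as R1-style models typically do. Check this against the raw output of your backend, since some chat templates or servers strip or relocate the tags. The helper names are hypothetical, and word counts are only a rough proxy for tokens.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think>; verify on real output.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from a raw completion."""
    match = THINK_RE.search(text)
    if match is None:
        # No reasoning block found: treat the whole text as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


def reasoning_share(text: str) -> float:
    """Fraction of whitespace-separated words spent inside the reasoning block."""
    reasoning, answer = split_reasoning(text)
    n_reasoning = len(reasoning.split())
    total = n_reasoning + len(answer.split())
    return n_reasoning / total if total else 0.0


sample = "<think>2 + 2 is 4. Double check: yes.</think>The answer is 4."
reasoning, answer = split_reasoning(sample)
print(answer)
print(round(reasoning_share(sample), 2))
```

Logging this share across real prompts gives a rough sense of how much latency and cost comes from reasoning rather than from the answer itself.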