DeepSeek R1 0528
deepseek-ai · released 2025-05-28 · MIT license
DeepSeek-R1-0528 (May 2025) is a reasoning-focused open-weights mixture-of-experts (MoE) model that reached 87.5% on AIME 2025 and 81.0% on GPQA Diamond per its model card, placing it among the top open-weights models at the time.
Key specs
| Spec | Value |
|---|---|
| Type | Local open-weight |
| Parameters | 684.53B total (MoE; active parameter count not listed) |
| Architecture | deepseek_v3 |
| Context window | 164K tokens |
| Knowledge cutoff | 2025-03-31 |
| Modalities | text |
| Recommended backends | — |
| Minimum viable rig | — |
Benchmark scores
| Metric | Value |
|---|---|
| GPQA Diamond | 81% |
| SWE-bench Verified | 57.6% |
| AIME 2025 | 87.5% |
| MMLU-Pro | 85% |
| BFCL v3 (tool use) | — |
| Composite score | 6.42 |
| Community rating | No reviews yet |
VRAM & disk per quantization
| Quant | VRAM | Disk | RAM | Context |
|---|---|---|---|---|
| Q4_K_M | 398.5 GB | 397 GB | — | 164K |
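As a rough sanity check on the Q4_K_M row, the disk size can be turned into an average bits-per-weight figure using the total parameter count from the Key specs table. This is only a back-of-envelope sketch: it assumes "GB" means 10^9 bytes (the page does not say), and the helper names `bits_per_weight` and `size_gb_at` are made up for illustration. An average above 4 bits would be consistent with Q4_K_M being a mixed-precision scheme rather than a flat 4-bit format.

```python
# Back-of-envelope check of the Q4_K_M row above.
# Assumption: "GB" means 10**9 bytes; the page does not specify the unit.

TOTAL_PARAMS = 684.53e9   # total parameters, from the Key specs table
DISK_GB = 397.0           # Q4_K_M disk size, from the quantization table
VRAM_GB = 398.5           # Q4_K_M VRAM figure, from the quantization table


def bits_per_weight(size_gb: float, params: float) -> float:
    """Average bits stored per parameter for a file of size_gb gigabytes."""
    return size_gb * 1e9 * 8 / params


def size_gb_at(bits: float, params: float) -> float:
    """File size in GB if every parameter took `bits` bits on average."""
    return params * bits / 8 / 1e9


bpw = bits_per_weight(DISK_GB, TOTAL_PARAMS)
overhead_gb = VRAM_GB - DISK_GB  # gap between the listed VRAM and disk figures

print(f"average bits per weight: {bpw:.2f}")
print(f"VRAM minus disk: {overhead_gb:.1f} GB")
print(f"hypothetical flat 4.0-bit size: {size_gb_at(4.0, TOTAL_PARAMS):.1f} GB")
```

The small gap between the VRAM and disk figures suggests the listed VRAM number is close to weights only. Holding the full 164K context in memory would presumably need additional room for the KV cache, which this page does not break out.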
Strengths & weaknesses
Strengths: Top-tier open-weights reasoning at release (AIME 2025 87.5%, GPQA Diamond 81.0%); Large jump in deep reasoning over the original R1; MIT license, distillation-friendly
Weaknesses: High token usage and latency from long chain-of-thought; Surpassed by V3.1/V3.2 on agentic coding