DeepSeek V3

deepseek-ai · released 2024-12-26 · DeepSeek Model License

DeepSeek-V3 (Dec 2024) is the original 671B MoE chat model. Its technical report lists 75.9% on MMLU-Pro, 59.1% on GPQA Diamond, 90.2% on MATH-500 and 42.0% on SWE-bench Verified. It was the strongest open model of its time and predates the R1/V3.1 reasoning upgrades.

Key specs

Type	Local open-weight
Parameters	684.53B total (671B main model + multi-token-prediction module) · MoE, 37B active
Architecture	deepseek_v3
Context window	164K tokens
Knowledge cutoff	2024-07-31
Modalities	text
Recommended backends	—
Minimum viable rig	—

Benchmark scores

GPQA Diamond	59.1%
SWE-bench Verified	42.0%
AIME	39.2%
MMLU-Pro	75.9%
BFCL v3 (tool use)	—
Composite score	5.61
Community rating	No reviews yet

VRAM & disk per quantization

Quant	VRAM	Disk	RAM	Context
Q4_K_M	398.5 GB	397 GB	—	164K
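
As a rough sanity check on the Q4_K_M row, the sketch below derives the average bits stored per parameter from the listed disk size and total parameter count, plus the gap between the VRAM and disk figures. It assumes "GB" means 10^9 bytes and that the 684.53B total is the count actually stored in the file; the function names are illustrative and not part of any tool.

```python
# Figures copied from the tables above; GB is assumed to mean 10**9 bytes.
TOTAL_PARAMS = 684.53e9  # total parameters listed under Key specs
DISK_GB = 397.0          # Q4_K_M disk size
VRAM_GB = 398.5          # Q4_K_M VRAM figure


def implied_bits_per_weight(disk_gb: float, params: float) -> float:
    """Average bits stored per parameter, derived from the file size."""
    return disk_gb * 1e9 * 8 / params


def estimate_disk_gb(params: float, bits_per_weight: float) -> float:
    """Inverse of the above: file size in GB for a given bits-per-weight."""
    return params * bits_per_weight / 8 / 1e9


bpw = implied_bits_per_weight(DISK_GB, TOTAL_PARAMS)
overhead_gb = VRAM_GB - DISK_GB

print(f"implied bits per weight: {bpw:.2f}")        # 4.64
print(f"VRAM beyond weights: {overhead_gb:.1f} GB")  # 1.5 GB
```

Under these assumptions the row works out to about 4.64 bits per weight, and the VRAM figure sits only 1.5 GB above the file size, so it likely leaves little room for KV cache at long context lengths.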

Strengths & weaknesses

Strengths: Landmark efficient open-weights MoE (671B total, 37B active); strong math and code results for a 2024 non-reasoning model (MATH-500 90.2%)

Weaknesses: Original Dec 2024 release, now behind newer models on GPQA and SWE-bench; weights are under the DeepSeek Model License rather than plain MIT