Phi-3.5-MoE-instruct
Microsoft · released 2024-08-01 · MIT license
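Phi-3.5-MoE-instruct is Microsoft's August 2024 mixture-of-experts (MoE) model: 16x3.8B experts with 6.6B parameters active per token, a 128K-token context window, and an MIT license. It is the strongest model in the Phi-3.5 family and rivals larger models on reasoning and long-context retrieval (MMLU-Pro 54.3, RULER 87.1).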
Key specs
| Spec | Value |
|---|---|
| Type | Local open-weight |
| Parameters | 41.87B total |
| Architecture | phimoe |
| Context window | 128K tokens (131,072) |
| Knowledge cutoff | 2023-10-01 |
| Modalities | text |
| Recommended backends | — |
| Minimum viable rig | — |
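The parameter figures quoted on this page can be related with a little arithmetic. The sketch below is illustrative only: the function name `moe_summary` is hypothetical, and it simply compares the nominal "16x3.8B" label against the reported 41.87B total and the 6.6B active figure. The gap between nominal and reported totals is consistent with experts sharing some layers (such as attention and embeddings), though this page does not state that.

```python
# Figures quoted on this page (billions of parameters).
TOTAL_PARAMS_B = 41.87    # reported total
ACTIVE_PARAMS_B = 6.6     # active per token
NUM_EXPERTS = 16
NOMINAL_EXPERT_B = 3.8    # the "3.8B" in the 16x3.8B label


def moe_summary(total_b, active_b, num_experts, nominal_expert_b):
    """Relate the nominal expert label to the reported parameter counts."""
    nominal_total_b = num_experts * nominal_expert_b
    return {
        # What 16 fully independent 3.8B models would add up to.
        "nominal_total_b": round(nominal_total_b, 2),
        # Share of the reported total that is used for each token.
        "active_fraction": round(active_b / total_b, 3),
        # Reported total relative to the nominal total.
        "reported_to_nominal": round(total_b / nominal_total_b, 3),
    }


summary = moe_summary(TOTAL_PARAMS_B, ACTIVE_PARAMS_B, NUM_EXPERTS, NOMINAL_EXPERT_B)
print(summary)
```

On these numbers, roughly 16% of the weights are used per token, which is why compute cost tracks the 6.6B figure while memory cost tracks the 41.87B figure.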
Benchmark scores
| Benchmark | Score |
|---|---|
| GPQA Diamond | — |
| SWE-bench Verified | — |
| AIME | — |
| MMLU-Pro | 54.3% |
| BFCL v3 (tool use) | — |
| Composite score | 5.95 |
| Community rating | No reviews yet |
VRAM & disk per quantization
| Quant | VRAM | Disk | RAM | Context |
|---|---|---|---|---|
| Q4_K_M | 25.8 GB | 24.3 GB | — | 131K |
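As a sanity check on the Q4_K_M row, the disk size and total parameter count imply an average bits-per-weight figure, and the difference between VRAM and disk gives the runtime overhead this page assumes. This is a rough sketch, not a sizing tool: the function names are hypothetical, it assumes 1 GB = 10^9 bytes, and the page does not say what the extra VRAM covers or at what context length it was measured.

```python
def implied_bits_per_weight(disk_gb, total_params_b):
    """Average bits stored per parameter.

    GB (1e9 bytes, assumed) and billions of parameters cancel,
    leaving bytes per parameter; multiply by 8 for bits.
    """
    return disk_gb * 8 / total_params_b


def runtime_overhead_gb(vram_gb, disk_gb):
    """VRAM listed beyond the size of the weights themselves."""
    return vram_gb - disk_gb


# Values from the Q4_K_M row and the 41.87B total in the specs table.
bpw = implied_bits_per_weight(disk_gb=24.3, total_params_b=41.87)
overhead = runtime_overhead_gb(vram_gb=25.8, disk_gb=24.3)
print(f"{bpw:.2f} bits/weight, {overhead:.1f} GB overhead")
```

The same two functions can be reused to compare other quantizations if their disk and VRAM figures become available.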
Strengths & weaknesses
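Strengths: Strong reasoning and knowledge for 6.6B active parameters (MMLU 78.9); best-in-class long-context retrieval (RULER 87.1) with a 128K context window; MIT-licensed MoE (16x3.8B)
Weaknesses: Higher memory footprint than dense small models, since all 41.87B parameters must be loaded even though only 6.6B are active per token; limited factual recall, typical of the Phi family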