Reka Flash 3
RekaAI · released 2025-03-12 · apache-2.0 license
Reka Flash 3 is a 21B-parameter open-weight reasoning model, released in March 2025 under Apache-2.0 and positioned against o1-mini for low-latency use. Independent evaluations report GPQA Diamond 53.5, AIME'25 25.3, and LiveCodeBench 44.8; the vendor reports MMLU-Pro 65.0.
Key specs
| Spec | Value |
|---|---|
| Type | Local open-weight |
| Parameters | 20.91B total |
| Architecture | llama |
| Context window | 66K tokens |
| Knowledge cutoff | 2025-01-31 |
| Modalities | text |
| Recommended backends | — |
| Minimum viable rig | — |
Benchmark scores
| Metric | Value |
|---|---|
| GPQA Diamond | 53.53% |
| SWE-bench Verified | — |
| AIME'25 | 25.33% |
| MMLU-Pro (vendor-reported) | 65.0% |
| BFCL v3 (tool use) | — |
| Composite score | 4.88 |
| Community rating | No reviews yet |
VRAM & disk per quantization
| Quant | VRAM | Disk | RAM | Context |
|---|---|---|---|---|
| Q4_K_M | 13.6 GB | 12.1 GB | — | 66K |
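The figures in this table can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it assumes "GB" means decimal gigabytes (10^9 bytes), and the helper names `weight_size_gb` and `implied_bits_per_weight` are hypothetical, not part of any tool. It shows that a flat 4 bits per weight on 20.91B parameters gives roughly the "~11GB at 4-bit" figure quoted under Strengths, while the listed 12.1 GB Q4_K_M file implies somewhat more than 4 bits per weight on average.

```python
# Back-of-the-envelope check of the quantization table.
# Assumption: "GB" is decimal (1 GB = 1e9 bytes); the page does not say.

def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB for a given average bit width."""
    return params_billion * bits_per_weight / 8

def implied_bits_per_weight(size_gb: float, params_billion: float) -> float:
    """Average bits per weight implied by a file size in GB."""
    return size_gb * 8 / params_billion

PARAMS_B = 20.91        # total parameters, from the Key specs table
Q4_K_M_DISK_GB = 12.1   # from the quantization table
Q4_K_M_VRAM_GB = 13.6   # from the quantization table

ideal_4bit = weight_size_gb(PARAMS_B, 4.0)
implied_bits = implied_bits_per_weight(Q4_K_M_DISK_GB, PARAMS_B)
# Gap between listed VRAM and file size. Attributing it to KV cache and
# runtime buffers is an assumption; the page gives no breakdown.
vram_overhead = Q4_K_M_VRAM_GB - Q4_K_M_DISK_GB

print(f"Flat 4-bit weights:        {ideal_4bit:.2f} GB")
print(f"Implied bits/weight (Q4_K_M): {implied_bits:.2f}")
print(f"VRAM beyond file size:     {vram_overhead:.1f} GB")
```

Under these assumptions the 1.5 GB gap between VRAM and disk is what would remain for context and runtime buffers, and it would likely grow if the full 66K context is actually used.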
Strengths & weaknesses
Strengths: competitive with o1-mini at 21B parameters; quantization-friendly (about 11 GB at 4-bit), which suits on-device use; permissive Apache-2.0 license
Weaknesses: weak on knowledge-intensive tasks (the vendor reports MMLU-Pro 65.0); essentially English-only