DeepSeek.
Chinese open-weights lab. Its DeepSeek-V3 and R1 models brought frontier-class capability to open models at low cost.
Models from DeepSeek.
DeepSeek V3
DeepSeek's flagship MoE — 671B total, 37B active, frontier-class.
DeepSeek R1
DeepSeek's reasoning model — RL-trained, frontier-class, MIT-licensed.
DeepSeek R1 Distill Llama 70B
70B Llama distilled from DeepSeek R1's reasoning traces.
DeepSeek R1 Distill Qwen 32B
32B Qwen base distilled from DeepSeek R1.
DeepSeek R1 Distill Qwen 14B
14B distilled R1 — laptop-friendly reasoning.
DeepSeek R1 Distill Qwen 7B
7B distilled R1 — runs on any modern GPU.
DeepSeek R1 Distill Qwen 1.5B
Tiny distilled R1 — phone / browser deployable.
DeepSeek Coder V2 236B
DeepSeek's MoE coding model — 236B total, 21B active.
DeepSeek Coder V2 Lite
16B MoE / 2.4B active — laptop-class coder.
DeepSeek Coder 33B
DeepSeek Coder 6.7B
DeepSeek Coder 33B Instruct
DeepSeek: DeepSeek V3
DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous vers...
DeepSeek: DeepSeek V3 0324
DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team...
DeepSeek: DeepSeek V3.1
DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prom...
DeepSeek: DeepSeek V3.1 Terminus
DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities whi...
DeepSeek: DeepSeek V3.2
DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use pe...
DeepSeek: DeepSeek V3.2 Exp
DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectu...
DeepSeek: DeepSeek V3.2 Speciale
DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance. It builds on D...
DeepSeek: DeepSeek V4 Flash
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated paramete...
DeepSeek: DeepSeek V4 Pro
DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporti...
DeepSeek OCR 2
DeepSeek: R1 0528
May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance on par with [OpenAI o1](/openai/o1), but open-sourced an...
DeepSeek-V3-0324
DeepSeek-V3-0324, a strong Mixture-of-Experts (MoE) language model with 671B total parameters and 37B activated for each token, an impro...
DeepSeek-V3.1
DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase l...
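
Several entries in this listing quote both a total and an active parameter count for Mixture-of-Experts models. As a rough illustration, the sketch below computes what share of each model's parameters is active per token. It uses only the figures quoted in the listing (taken as given, not independently verified); the names `MOE_MODELS` and `active_fraction` are hypothetical helpers introduced for this example.

```python
# Parameter counts in billions, copied from the listing (total, active).
# The figures are as quoted in the listing and are not independently verified.
MOE_MODELS = {
    "DeepSeek V3": (671, 37),
    "DeepSeek Coder V2 236B": (236, 21),
    "DeepSeek Coder V2 Lite": (16, 2.4),
    "DeepSeek V4 Flash": (284, 13),
    "DeepSeek V4 Pro": (1600, 49),  # 1.6T expressed in billions
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters activated per token (0.0 to 1.0)."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


if __name__ == "__main__":
    for name, (total, active) in MOE_MODELS.items():
        share = active_fraction(total, active)
        print(f"{name}: {active}B of {total}B active ({share:.1%})")
```

The active count is a rough guide to per-token compute, while the total count is what must be stored to serve the model, so a low ratio does not by itself make a model small enough for modest hardware.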