AI model maker · CN

DeepSeek.

Chinese open-weights lab. DeepSeek-V3 and R1 brought frontier-class capability to open-weights models at low cost.

Models from DeepSeek.

DeepSeek V3

671B
by DeepSeek · DeepSeek · 128,000 ctx

DeepSeek's flagship MoE — 671B total, 37B active, frontier-class.

DeepSeek R1

671B
by DeepSeek · DeepSeek · 128,000 ctx

DeepSeek's reasoning model — RL-trained, frontier-class, MIT-licensed.

DeepSeek R1 Distill Llama 70B

70B
by DeepSeek · DeepSeek · 128,000 ctx

70B Llama distilled from DeepSeek R1's reasoning traces.

DeepSeek R1 Distill Qwen 32B

33B
by DeepSeek · DeepSeek · 128,000 ctx

32B Qwen base distilled from DeepSeek R1.

DeepSeek R1 Distill Qwen 14B

15B
by DeepSeek · DeepSeek · 128,000 ctx

14B distilled R1 — laptop-friendly reasoning.

DeepSeek R1 Distill Qwen 7B

8B
by DeepSeek · DeepSeek · 128,000 ctx

7B distilled R1 — runs on any modern GPU.

DeepSeek R1 Distill Qwen 1.5B

2B
by DeepSeek · DeepSeek · 128,000 ctx

Tiny distilled R1 — phone / browser deployable.

DeepSeek Coder V2 236B

236B
by DeepSeek · DeepSeek Coder · 128,000 ctx

DeepSeek's MoE coding model — 236B total, 21B active.

DeepSeek Coder V2 Lite

16B
by DeepSeek · DeepSeek Coder · 128,000 ctx

16B MoE / 2.4B active — laptop-class coder.

DeepSeek Coder 33B

33B
by DeepSeek · DeepSeek Coder · 16,384 ctx

DeepSeek Coder 6.7B

7B
by DeepSeek · DeepSeek Coder · 16,384 ctx

DeepSeek Coder 33B Instruct

33B
by DeepSeek · 16,384 ctx

DeepSeek: DeepSeek V3

text
by DeepSeek · 163,840 ctx

DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous vers...

DeepSeek: DeepSeek V3 0324

text
by DeepSeek · 163,840 ctx

DeepSeek V3, a 685B-parameter mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team...

DeepSeek: DeepSeek V3.1

text
by DeepSeek · 163,840 ctx

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prom...

DeepSeek: DeepSeek V3.1 Terminus

text
by DeepSeek · 163,840 ctx

DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities whi...

DeepSeek: DeepSeek V3.2

text
by DeepSeek · 131,072 ctx

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use pe...

DeepSeek: DeepSeek V3.2 Exp

671B
by DeepSeek · 163,840 ctx

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectu...

DeepSeek: DeepSeek V3.2 Speciale

text
by DeepSeek · 163,840 ctx

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance. It builds on D...

DeepSeek: DeepSeek V4 Flash

text
by DeepSeek · 1,048,576 ctx

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated paramete...

DeepSeek: DeepSeek V4 Pro

text
by DeepSeek · 1,048,576 ctx

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporti...

DeepSeek OCR 2

text
by DeepSeek · 8,192 ctx

DeepSeek: R1 0528

671B
by DeepSeek · 163,840 ctx

May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance on par with [OpenAI o1](/openai/o1), but open-sourced an...

DeepSeek-V3-0324

671B
by DeepSeek · 163,840 ctx

DeepSeek-V3-0324, a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token, an impro...

DeepSeek-V3.1

text
by DeepSeek · 163,840 ctx

DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase l...