DeepSeek V4 Launches: 1 Trillion Parameters, 1M Context, Open-Weight
DeepSeek's Most Ambitious Release Yet
Chinese AI startup DeepSeek released DeepSeek V4 on February 17, coinciding with the Lunar New Year. The model features 1 trillion total parameters, a 1-million-token context window, and three architectural innovations: mHC (Manifold-Constrained Hyper-Connections), Engram conditional memory, and Sparse Attention. It is released as an open-weight model.
Technical Highlights
- mHC architecture: Addresses fundamental Transformer stability issues, improving large-scale training
- Engram memory: Enables efficient long-context management across sessions
- Sparse Attention: Reduces inference costs while handling extended context
- 1M-token context: Can process entire codebases in a single pass for true multi-file reasoning
Performance Claims
DeepSeek's internal benchmarks report that V4 surpasses Claude 3.5 Sonnet and GPT-4o on coding tasks, achieving over 80% on SWE-bench. The company claims inference costs are 10–40× lower than comparable Western frontier models.
Runs on Consumer Hardware
As an open-weight release, V4 is designed to run on dual NVIDIA RTX 4090s or a single RTX 5090 — making state-of-the-art coding AI accessible outside cloud infrastructure. The model is available for immediate download by developers globally.
Related Articles
Chinese AI lab DeepSeek plans to release its flagship V4 model this week—a 1-trillion-parameter native multimodal model built around Huawei Ascend chips that deliberately bypasses Nvidia and AMD.
A widely-shared r/LocalLLaMA comparison of Qwen's smallest models across three generations (score: 681) reveals extraordinary efficiency gains. The Qwen 3.5 9B now outperforms the previous-generation 80B on several benchmarks, while the 2B handles video understanding better than many 7B models.
A remarkable 13-month comparison: running frontier-level DeepSeek R1 at ~5 tokens/second cost $6,000 in early 2025. Today, you can run a significantly stronger model at the same speed on a $600 mini PC — and get 17-20 t/s with even more capable models.