DeepSeek V4 Launches: 1 Trillion Parameters, 1M Context, Open-Weight
DeepSeek's Most Ambitious Release Yet
Chinese AI startup DeepSeek released DeepSeek V4 on February 17, coinciding with the Lunar New Year. The model features 1 trillion total parameters, a 1-million-token context window, and three architectural innovations: mHC (Manifold-Constrained Hyper-Connections), Engram conditional memory, and Sparse Attention. It is released as an open-weight model.
Technical Highlights
- mHC architecture: Addresses fundamental Transformer stability issues, improving large-scale training
- Engram memory: Enables efficient long-context management across sessions
- Sparse Attention: Reduces inference costs while handling extended context (a generic sketch follows this list)
- 1M-token context: Can process entire codebases in a single pass for true multi-file reasoning
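DeepSeek has not detailed how V4's Sparse Attention actually works, so the snippet below is only a generic illustration of the idea: a causal sliding-window mask in which each token attends to a fixed number of recent positions rather than to the full sequence. The PyTorch framing, function name, window size, and tensor shapes are all assumptions for demonstration, not DeepSeek's implementation.

```python
# Generic sliding-window sparse attention, purely illustrative.
# DeepSeek has not published V4's attention mechanism; the window size,
# shapes, and function name here are assumptions for demonstration.
import torch
import torch.nn.functional as F

def sliding_window_attention(q, k, v, window: int = 128):
    """Each query attends only to the `window` most recent keys, so the
    useful work grows roughly linearly with sequence length instead of
    quadratically (a real kernel would never materialize the full matrix)."""
    seq_len = q.size(-2)
    scale = q.size(-1) ** -0.5
    scores = (q @ k.transpose(-2, -1)) * scale           # (..., seq, seq)

    idx = torch.arange(seq_len)
    causal = idx[None, :] <= idx[:, None]                 # no attending to future tokens
    local = (idx[:, None] - idx[None, :]) < window        # only the last `window` keys
    mask = causal & local
    scores = scores.masked_fill(~mask, float("-inf"))

    return F.softmax(scores, dim=-1) @ v

# Example: one head, 1024 tokens, 64-dimensional queries/keys/values.
q = k = v = torch.randn(1, 1024, 64)
out = sliding_window_attention(q, k, v, window=128)
print(out.shape)  # torch.Size([1, 1024, 64])
```

Sparse patterns of this kind are what make very long context windows economical, since attention cost no longer scales with the square of the full sequence length.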
Performance Claims
DeepSeek's internal benchmarks report that V4 surpasses Claude 3.5 Sonnet and GPT-4o on coding tasks, scoring over 80% on SWE-bench. The company claims inference costs 10–40× lower than those of comparable Western frontier models.
Runs on Consumer Hardware
As an open-weight release, V4 is designed to run on dual NVIDIA RTX 4090s or a single RTX 5090, putting state-of-the-art coding AI within reach outside cloud infrastructure. The model is available for immediate download by developers worldwide.
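For context, open-weight checkpoints are usually loaded through the standard Hugging Face stack. The sketch below assumes transformers, accelerate, and bitsandbytes are installed, uses a hypothetical repository ID, and presumes a quantized variant that actually fits in the VRAM of the cards mentioned above; none of these specifics are confirmed by DeepSeek.

```python
# Hypothetical local-inference sketch; the repository ID is a placeholder,
# not a confirmed DeepSeek release path. Assumes transformers, accelerate,
# and bitsandbytes, and that a quantized variant fits in local VRAM.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "deepseek-ai/DeepSeek-V4"  # placeholder name for illustration only

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # let accelerate shard layers across all visible GPUs
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)

prompt = "Write a function that merges two sorted lists."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The `device_map="auto"` setting is what would spread the layers across a dual-GPU setup; whether a 1-trillion-parameter checkpoint fits on consumer cards depends on the quantization DeepSeek ships, which is not specified here.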
Related Articles
- Chinese AI lab DeepSeek plans to release its flagship V4 model this week: a 1-trillion-parameter native multimodal model built around Huawei Ascend chips that deliberately bypasses Nvidia and AMD.
- An r/LocalLLaMA benchmark compared 21 local coding models on HumanEval+, speed, and memory, putting Qwen 3.6 35B-A3B on top while surfacing practical RAM and tok/s trade-offs.
- Why it matters: document agents fail when PDF parsing destroys table and column structure. LiteParse uses a monospace grid projection approach instead of heavy layout models, and the code is open source.