DeepSeek to Release V4 This Week: 1-Trillion-Parameter Multimodal Model Optimized for Huawei Chips
Chinese AI research lab DeepSeek is set to release DeepSeek V4 this week, according to sources cited by TechNode and the Financial Times on March 2, 2026. The model has been repeatedly delayed since mid-February; its release is now timed to coincide with the start of China's annual Two Sessions parliamentary meetings on March 4.
V4 is built on a Mixture-of-Experts (MoE) architecture, activating approximately 32 billion of its 1 trillion total parameters per token. It is a native multimodal model—trained from the ground up on text, images, video, and audio—with a context window of up to 1 million tokens. Leaked, unverified benchmarks suggest V4 scores around 90% on HumanEval and above 80% on SWE-Bench Verified, which would put it in the same tier as Claude Opus 4.6 and GPT-5.3 Codex on coding tasks.
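The "32 billion active of 1 trillion total" figure comes from how MoE routing works: a router scores all experts for each token and only the top-k run, so compute scales with active rather than total parameters (here roughly 3.2% per token). A minimal sketch of that idea, with hypothetical expert counts and sizes (nothing here reflects DeepSeek's actual implementation):

```python
import numpy as np

# Toy top-k MoE routing: only a small fraction of experts runs per token.
# Expert count, k, and dimensions are hypothetical, for illustration only.
rng = np.random.default_rng(0)

n_experts = 64   # hypothetical total expert count
top_k = 2        # experts activated per token
d_model = 16

def moe_forward(x, experts, router_w, k=top_k):
    """Route token vector x to its top-k experts and mix their outputs."""
    logits = router_w @ x                     # one routing score per expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                  # softmax over the chosen k
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Each "expert" is just a random linear layer in this sketch.
experts = [(lambda W: (lambda x: W @ x))(rng.standard_normal((d_model, d_model)))
           for _ in range(n_experts)]
router_w = rng.standard_normal((n_experts, d_model))

x = rng.standard_normal(d_model)
y = moe_forward(x, experts, router_w)
print(y.shape)                                  # (16,)
print(f"active fraction: {top_k / n_experts:.4f}")  # 0.0312
```

At V4's reported scale the same ratio would be about 32B / 1000B ≈ 0.032, which is what makes trillion-parameter models affordable to serve.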
A notable strategic decision: DeepSeek deliberately excluded Nvidia and AMD from the pre-release optimization pipeline, building V4's inference stack entirely around Huawei Ascend and Cambricon chips. This positions V4 as a flagship model for Chinese domestic AI hardware, reflecting a broader trend of reducing dependence on US export-controlled semiconductors.
Three new architectural innovations are claimed: Manifold-Constrained Hyper-Connections (training stability at scale), Engram Conditional Memory (efficient retrieval across million-token contexts), and an enhanced Lightning Indexer for sparse attention.
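Details of these mechanisms have not been published, but the sparse-attention idea behind an indexer is straightforward: score every key cheaply, then run full attention over only the top-k highest-scoring positions, so cost grows with k instead of the full million-token context. A generic sketch of that pattern (not DeepSeek's Lightning Indexer; all sizes are hypothetical):

```python
import numpy as np

# Generic top-k sparse attention: cheap scoring pass selects a small subset
# of keys, then softmax attention runs only over that subset.
rng = np.random.default_rng(1)

seq_len, d, k = 1024, 32, 64   # hypothetical context length, dim, and budget

def sparse_attention(q, K, V, k=k):
    """Attend over only the k highest-scoring keys for query q."""
    scores = K @ q / np.sqrt(d)      # one index score per key position
    top = np.argsort(scores)[-k:]    # keep the k most relevant positions
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()                     # softmax restricted to selected keys
    return w @ V[top]                # weighted mix of their values

K = rng.standard_normal((seq_len, d))
V = rng.standard_normal((seq_len, d))
q = rng.standard_normal(d)

out = sparse_attention(q, K, V)
print(out.shape)   # (32,)
```

With a fixed budget k, the per-query attention cost stays constant as the context grows, which is the property that makes million-token windows tractable.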
Read more at TechNode.
Related Articles
DeepSeek released V4 on Lunar New Year with 1 trillion parameters, 1M-token context windows, and novel mHC architecture. The open-weight model claims benchmark-topping coding performance at 10–40× lower inference costs than Western frontier models.
Alibaba launched Qwen3.5, a 397B-parameter open-weight multimodal model supporting 201 languages. The company claims it outperforms GPT-5.2, Claude Opus 4.5, and Gemini 3 on benchmarks, while costing 60% less than its predecessor.
A widely shared r/LocalLLaMA comparison of Qwen's smallest models across three generations (score: 681) shows substantial efficiency gains. The Qwen 3.5 9B now outperforms the previous-generation 80B on several benchmarks, while the 2B handles video understanding better than many 7B models.