DeepSeek to Release V4 This Week: 1-Trillion-Parameter Multimodal Model Optimized for Huawei Chips
Chinese AI research lab DeepSeek is set to release DeepSeek V4 this week, according to sources cited by TechNode and the Financial Times on March 2, 2026. The model has been repeatedly delayed since mid-February; its release is now timed to coincide with the start of China's annual Two Sessions parliamentary meetings on March 4.
V4 is built on a Mixture-of-Experts (MoE) architecture, activating approximately 32 billion of its 1 trillion total parameters per token. It is a native multimodal model—trained from the ground up on text, images, video, and audio—with a context window of up to 1 million tokens. Leaked, unverified benchmarks suggest V4 scores around 90% on HumanEval and above 80% on SWE-Bench Verified, which would put it in the same tier as Claude Opus 4.6 and GPT-5.3 Codex on coding tasks.
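The "32 billion active of 1 trillion total" figure comes from how MoE routing works: a router scores all experts for each token and only the top-k run, so compute scales with active rather than total parameters (here roughly 3.2% per token). A minimal sketch of that idea, with hypothetical expert counts and sizes (nothing here reflects DeepSeek's actual implementation):

```python
import numpy as np

# Toy top-k MoE routing: only a small fraction of experts runs per token.
# Expert count, k, and dimensions are hypothetical, for illustration only.
rng = np.random.default_rng(0)

n_experts = 64   # hypothetical total expert count
top_k = 2        # experts activated per token
d_model = 16

def moe_forward(x, experts, router_w, k=top_k):
    """Route token vector x to its top-k experts and mix their outputs."""
    logits = router_w @ x                     # one routing score per expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                  # softmax over the chosen k
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Each "expert" is just a random linear layer in this sketch.
experts = [(lambda W: (lambda x: W @ x))(rng.standard_normal((d_model, d_model)))
           for _ in range(n_experts)]
router_w = rng.standard_normal((n_experts, d_model))

x = rng.standard_normal(d_model)
y = moe_forward(x, experts, router_w)
print(y.shape)                                  # (16,)
print(f"active fraction: {top_k / n_experts:.4f}")  # 0.0312
```

At V4's reported scale the same ratio would be about 32B / 1000B ≈ 0.032, which is what makes trillion-parameter models affordable to serve.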
A notable strategic decision: DeepSeek deliberately excluded Nvidia and AMD from the pre-release optimization pipeline, building V4's inference stack entirely around Huawei Ascend and Cambricon chips. This positions V4 as a flagship model for Chinese domestic AI hardware, reflecting a broader trend of reducing dependence on US export-controlled semiconductors.
Three new architectural innovations are claimed: Manifold-Constrained Hyper-Connections (training stability at scale), Engram Conditional Memory (efficient retrieval across million-token contexts), and an enhanced Lightning Indexer for sparse attention.
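Details of these mechanisms have not been published, but the sparse-attention idea behind an indexer is straightforward: score every key cheaply, then run full attention over only the top-k highest-scoring positions, so cost grows with k instead of the full million-token context. A generic sketch of that pattern (not DeepSeek's Lightning Indexer; all sizes are hypothetical):

```python
import numpy as np

# Generic top-k sparse attention: cheap scoring pass selects a small subset
# of keys, then softmax attention runs only over that subset.
rng = np.random.default_rng(1)

seq_len, d, k = 1024, 32, 64   # hypothetical context length, dim, and budget

def sparse_attention(q, K, V, k=k):
    """Attend over only the k highest-scoring keys for query q."""
    scores = K @ q / np.sqrt(d)      # one index score per key position
    top = np.argsort(scores)[-k:]    # keep the k most relevant positions
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()                     # softmax restricted to selected keys
    return w @ V[top]                # weighted mix of their values

K = rng.standard_normal((seq_len, d))
V = rng.standard_normal((seq_len, d))
q = rng.standard_normal(d)

out = sparse_attention(q, K, V)
print(out.shape)   # (32,)
```

With a fixed budget k, the per-query attention cost stays constant as the context grows, which is the property that makes million-token windows tractable.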
Read more at TechNode.
Related Articles
DeepSeek released V4 on Lunar New Year with 1 trillion parameters, 1M-token context windows, and novel mHC architecture. The open-weight model claims benchmark-topping coding performance at 10–40× lower inference costs than Western frontier models.
Alibaba launched Qwen3.5, a 397B-parameter open-weight multimodal model supporting 201 languages. The company claims it outperforms GPT-5.2, Claude Opus 4.5, and Gemini 3 on benchmarks, while costing 60% less than its predecessor.
A widely shared r/LocalLLaMA comparison of Qwen's smallest models across three generations (score: 681) shows substantial efficiency gains. The Qwen 3.5 9B now outperforms the previous-generation 80B on several benchmarks, while the 2B handles video understanding better than many 7B models.