DeepSeek V4 Launches: 1 Trillion Parameters, 1M Context, Open-Weight

DeepSeek's Most Ambitious Release Yet

Chinese AI startup DeepSeek released DeepSeek V4 on February 17, coinciding with the Lunar New Year. The model features 1 trillion total parameters, a 1-million-token context window, and three architectural innovations: mHC (Manifold-Constrained Hyper-Connections), Engram conditional memory, and Sparse Attention. It is released as an open-weight model.

Technical Highlights

mHC architecture: Targets Transformer training instability, with the aim of making large-scale training more reliable
Engram memory: Enables efficient long-context management across sessions
Sparse Attention: Reduces inference costs while handling extended context
1M-token context: Can process entire codebases in a single pass, allowing reasoning across multiple files
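
To make the 1M-token figure concrete, here is a minimal sketch of how one might check whether a codebase plausibly fits in a single pass. The 4-characters-per-token ratio, the file extensions, and the output reserve are all assumptions for illustration; none of them come from DeepSeek, and a real tokenizer will give different counts.

```python
import os

CHARS_PER_TOKEN = 4          # assumption: rough heuristic, real tokenizers vary
CONTEXT_WINDOW = 1_000_000   # tokens, the figure quoted in the article


def estimate_tokens(text: str) -> int:
    """Estimate a token count from the character count, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def codebase_tokens(root: str, extensions=(".py", ".js", ".ts", ".go", ".rs")) -> int:
    """Sum estimated tokens over all source files under root (hypothetical helper)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    total += estimate_tokens(fh.read())
    return total


def fits_in_context(token_count: int, reserve_for_output: int = 8_000) -> bool:
    """True if the prompt plus an assumed output reserve fits in the window."""
    return token_count + reserve_for_output <= CONTEXT_WINDOW
```

Calling `fits_in_context(codebase_tokens("path/to/repo"))` gives a quick yes/no before attempting a single-pass prompt. The estimate is deliberately coarse and should be treated as a first filter only.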

Performance Claims

DeepSeek's internal benchmarks report that V4 surpasses Claude 3.5 Sonnet and GPT-4o on coding tasks, achieving over 80% on SWE-bench. The company claims inference costs are 10–40× lower than comparable Western frontier models.
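
As a rough illustration of what a 10–40× cost reduction would mean for a given workload, the sketch below converts a baseline price into the implied cost range. The baseline price and token volume in the usage example are hypothetical placeholders, not published pricing for any model.

```python
def cost_range(baseline_usd_per_million: float, tokens: int,
               min_ratio: float = 10.0, max_ratio: float = 40.0):
    """Return (lowest, highest) implied cost in USD for the claimed ratio range.

    baseline_usd_per_million is a hypothetical price for a comparison model.
    """
    baseline = baseline_usd_per_million * tokens / 1_000_000
    return baseline / max_ratio, baseline / min_ratio


# Hypothetical workload: 50M tokens at an assumed $10 per million tokens.
low, high = cost_range(10.0, 50_000_000)
print(f"baseline: $500.00, implied range: ${low:.2f} to ${high:.2f}")
```

The width of the range matters: under these assumed inputs, the same claim covers anything from $12.50 to $50.00, so the actual saving depends heavily on which end of the 10–40× span applies.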

Runs on Consumer Hardware

As an open-weight release, V4 is designed to run on dual NVIDIA RTX 4090s or a single RTX 5090, which would make state-of-the-art coding AI accessible outside cloud infrastructure. The model is available for immediate download by developers globally.
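
For readers sizing hardware, a back-of-the-envelope sketch of weight storage is shown below. The article does not state how the model is quantized, how many parameters are active per token, or whether weights are offloaded to system memory, so the bit widths here are assumptions and the result covers weight storage only.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


TOTAL_PARAMS = 1_000_000_000_000   # 1 trillion, from the article
DUAL_4090_VRAM_GB = 2 * 24         # RTX 4090 ships with 24 GB each
SINGLE_5090_VRAM_GB = 32           # RTX 5090 ships with 32 GB

for bits in (16, 8, 4):            # assumed precisions, not confirmed by the source
    print(f"{bits}-bit weights: {weight_memory_gb(TOTAL_PARAMS, bits):.0f} GB")
```

Under these assumptions the full weight set is larger than the VRAM of either configuration, so a consumer setup would presumably depend on techniques the article does not describe, such as keeping only part of the model on the GPU at a time.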

Source: Introl, Vertu
