DeepSeek-V4 opens 1M context with 1.6T/49B and 284B/13B split
Original: DeepSeek-V4 Preview is live, open-sourced, and built around 1M context
What the post changed
DeepSeek moved its next flagship model from rumor to runnable release in one shot. The official account wrote that “DeepSeek-V4 Preview is officially live & open-sourced” and paired that with a concrete spec sheet rather than vague capability claims. The tweet says the Pro model uses 1.6T total parameters with 49B active, while the Flash model uses 284B total with 13B active, both positioned around a 1M context length. That matters because open-weight launches often hide the serving tradeoffs; this post exposed the split directly.
“DeepSeek-V4-Pro: 1.6T total / 49B active params… DeepSeek-V4-Flash: 284B total / 13B active params… API is updated & available today!”
The account is DeepSeek’s primary release channel, so it usually carries first-party model rollouts rather than commentary. The linked material matters almost as much as the tweet itself. DeepSeek attached a technical report hosted on Hugging Face and an open-weights collection page, which turns the post from marketing copy into a package developers can inspect. The launch link also points users straight to chat.deepseek.com, signaling that the company wants immediate hands-on use instead of a waitlist cycle.
Why the split is the real story
The interesting design choice is not only model size but the two-lane product shape. A Pro tier with 49B active parameters targets frontier quality, while a Flash tier with 13B active parameters gives DeepSeek a cheaper and faster lane for production traffic. That is a more operationally useful framing than a single giant checkpoint. It suggests DeepSeek is trying to win on cost control as much as on raw evaluation scores, especially now that long context is becoming table stakes for coding, agents, and document-heavy work.
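The cost lever behind the two-lane framing is the gap between total and active parameters in a mixture-of-experts model: only the active slice runs per token. As a rough sketch, using the parameter counts from the tweet and the standard ~2 × active-parameters FLOPs-per-token approximation for a forward pass (an estimate, not DeepSeek's published serving math):

```python
# Rough per-token inference cost comparison for the two tiers.
# Parameter counts come from the DeepSeek tweet; the FLOPs formula is
# a common back-of-envelope approximation, not official serving data.

def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs per generated token (~2N for N active params)."""
    return 2 * active_params

tiers = {
    "V4-Pro":   {"total": 1.6e12, "active": 49e9},
    "V4-Flash": {"total": 284e9,  "active": 13e9},
}

for name, p in tiers.items():
    sparsity = p["active"] / p["total"]   # fraction of weights used per token
    print(f"{name}: {sparsity:.1%} of weights active, "
          f"~{flops_per_token(p['active']):.1e} FLOPs/token")
```

By this estimate, Flash needs roughly 3.8× fewer FLOPs per token than Pro (13B vs 49B active), which is exactly the kind of production-traffic lane the split creates, independent of the 1.6T total footprint that still governs memory.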
What to watch next is whether independent benchmarks validate the 1M-context promise under real workloads, and whether the updated API pricing lands in a range that pressures other open and closed providers. The source tweet has already drawn more than 8.4 million views, which suggests the market was waiting for a concrete open release, not another teaser. Source: DeepSeek source tweet · technical report · open weights collection