DeepSeek turned a temporary V4-Pro API discount into standard pricing, intensifying the cost race around frontier-class LLM access. The posted table cuts output pricing from $3.48 to $0.87 per million tokens.
Bloomberg reports DeepSeek is pushing forward with a $10.29 billion financing round. Founder Liang Wenfeng publicly reaffirmed commitment to open-source AI development and AGI over short-term commercialization.
DeepSeek V4 Pro tied with GPT-5.2 on FoodTruck Bench, a 30-day agentic benchmark using 34 tools, at approximately 17x lower cost. The result arrived roughly 10 weeks after GPT-5.2 was tested.
DeepClaude keeps Claude Code's complete agent loop (file editing, bash, subagent spawning) while routing API calls to DeepSeek V4 Pro or other backends, cutting the output-token price from $15/M to $0.87/M.
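To make the size of that price gap concrete, here is a minimal arithmetic sketch. The two per-million prices come from the item above; the monthly token volume and the function name are hypothetical, chosen only for illustration, and input-token charges are left out entirely.

```python
def output_cost_usd(output_tokens: int, price_per_million_usd: float) -> float:
    """Cost of generating `output_tokens` at a given per-million-token price."""
    return output_tokens / 1_000_000 * price_per_million_usd


# Output prices quoted in the item above (USD per million output tokens).
ORIGINAL_OUTPUT_PRICE = 15.00
DEEPSEEK_V4_PRO_OUTPUT_PRICE = 0.87

# Hypothetical workload: 40M output tokens in a month (an assumption).
monthly_output_tokens = 40_000_000

before = output_cost_usd(monthly_output_tokens, ORIGINAL_OUTPUT_PRICE)
after = output_cost_usd(monthly_output_tokens, DEEPSEEK_V4_PRO_OUTPUT_PRICE)
reduction = 1 - after / before  # fraction of output spend removed

print(f"before: ${before:.2f}  after: ${after:.2f}  reduction: {reduction:.1%}")
```

Because the reduction is a ratio of the two prices, it stays the same for any token volume. Real bills would also depend on input tokens and cache behavior, which this sketch does not model.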
DeepSeek released DeepSeek-V4-Pro (1.6T total parameters, 49B active) and V4-Flash (284B total, 13B active), both Mixture-of-Experts models under the MIT license with a 1M-token context. V4-Pro is the largest open-weights model released so far, and its pricing at $1.74/M input undercuts GPT-5.4 and Claude Sonnet 4.6 by more than half.
LocalLLaMA reacted hard because DeepSeek's visual-primitives idea makes points and boxes part of reasoning itself, and the repo going private only made the thread hotter.
This matters because the fight over model copying is no longer staying inside lobbying letters and company blog posts. Reuters reported on April 26 that the U.S. State Department told diplomats worldwide to warn foreign governments about AI models allegedly distilled from U.S. systems, naming DeepSeek and also mentioning Moonshot AI and MiniMax.
Cache-hit pricing can decide whether long-context assistants are cheap enough to ship. DeepSeek said the entire API series now charges just one-tenth of the old rate for input cache hits, while keeping a 75% off V4-Pro promotion live.
Why it matters: model launches live or die on serving and training support, not just weights. LMSYS says its Day-0 stack reached 199 tok/s on B200 and 266 tok/s on H200, while staying strong out to 900K context.
Why it matters: open models rarely arrive with both giant context claims and deployable model splits. DeepSeek put hard numbers on the release with a 1M-context design, a 1.6T/49B Pro model, and a 284B/13B Flash variant.
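The total/active split is easier to read as a fraction. The sketch below uses only the parameter counts quoted above. Treating the active share as a rough proxy for per-token compute is an assumption, not a measured figure, and the helper name is hypothetical.

```python
# (total parameters, active parameters per token), as quoted in the item above.
MODELS = {
    "V4-Pro": (1.6e12, 49e9),
    "V4-Flash": (284e9, 13e9),
}


def active_fraction(total_params: float, active_params: float) -> float:
    """Share of a Mixture-of-Experts model's weights used for each token."""
    return active_params / total_params


for name, (total, active) in MODELS.items():
    print(f"{name}: {active_fraction(total, active):.2%} of parameters active per token")
```

Both models activate only a few percent of their weights per token, which is why the total parameter count alone says little about serving cost.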
HN did not latch onto DeepSeek V4 because of a polished launch page. The thread took off when commenters realized the front-page link was just updated docs while the weights and base models were already live for inspection.
LocalLLaMA upvoted this because it felt like real plumbing, not another benchmark screenshot. The excitement was about DeepSeek open-sourcing faster expert-parallel communication and reusable GPU kernels.