DeepSeek V4: Near-Frontier LLM Performance at a Fraction of the Cost

Original: "DeepSeek V4: almost on the frontier, a fraction of the price"

LLM · May 2, 2026 · By Insights AI (HN) · 1 min read

DeepSeek V4 Release

Chinese AI lab DeepSeek released two new models: DeepSeek-V4-Pro and V4-Flash. Both are Mixture-of-Experts models with 1-million-token context windows and MIT licenses, continuing DeepSeek's approach of open-weights releases designed to compete with frontier proprietary models.

Scale and Architecture

V4-Pro has 1.6 trillion total parameters with 49B active, making it the largest open-weights model released to date: it surpasses Kimi K2.6 (1.1T) and GLM-5.1 (754B), and is more than double the size of DeepSeek V3.2 (685B). V4-Flash is a lighter model at 284B total / 13B active parameters. On Hugging Face, V4-Pro weighs in at 865GB and V4-Flash at 160GB.

The Pricing Story

The headline differentiator is cost. V4-Flash costs $0.14/M input and $0.28/M output, cheaper than GPT-5.4 Nano ($0.20/$1.25). V4-Pro costs $1.74/M input and $3.48/M output, undercutting GPT-5.4 ($2.50/$15) and Claude Sonnet 4.6 ($3/$15) on both sides: roughly 30-40% cheaper on input, and less than a quarter of the price on output.
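To make the price gap concrete, the sketch below computes per-request cost from the per-million-token prices quoted above. The workload mix (100k input tokens, 2k output tokens) is a hypothetical example, not from the article:

```python
# Per-million-token prices quoted in the article: (input $/M, output $/M).
PRICES = {
    "DeepSeek V4-Pro": (1.74, 3.48),
    "DeepSeek V4-Flash": (0.14, 0.28),
    "GPT-5.4": (2.50, 15.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single request for the given model."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1e6

# Hypothetical long-context request: 100k tokens in, 2k tokens out.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 100_000, 2_000):.4f}")
```

On this mix, V4-Pro comes out around $0.18 per request versus $0.28 for GPT-5.4; the gap widens further for output-heavy workloads, where the $3.48 vs $15 output price dominates.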

Why the Efficiency

DeepSeek's paper explains the savings: in 1M-token context scenarios, V4-Pro requires only 27% of V3.2's per-token FLOPs and 10% of its KV-cache size. V4-Flash pushes further, to 10% of the FLOPs and 7% of the KV cache. This architectural efficiency is what enables the dramatically lower pricing. Self-reported benchmarks show V4-Pro competitive with frontier models but trailing GPT-5.4 and Gemini 3.1 Pro by approximately 3 to 6 months of capability progress, a gap that may be acceptable for many use cases given the cost advantage.
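As a rough illustration of what those reported ratios imply for serving, the sketch below scales a normalized V3.2 baseline by the fractions quoted above. The baseline is set to 1.0 (a placeholder, not a published figure); only the ratios come from the paper:

```python
# Reported fractions of V3.2's per-token FLOPs and KV-cache size
# at 1M-token context, per DeepSeek's paper.
RATIOS = {
    "V4-Pro": (0.27, 0.10),
    "V4-Flash": (0.10, 0.07),
}

BASELINE = 1.0  # normalized V3.2 cost; absolute numbers are placeholders

def relative_cost(model: str) -> tuple[float, float]:
    """Return (per-token FLOPs, KV-cache size) relative to V3.2."""
    flop_frac, kv_frac = RATIOS[model]
    return BASELINE * flop_frac, BASELINE * kv_frac

for model in RATIOS:
    flops, kv = relative_cost(model)
    print(f"{model}: {flops:.2f}x FLOPs, {kv:.2f}x KV cache vs V3.2")
```

Read the other way around, a V4-Pro deployment could serve roughly 10x the 1M-token sessions of V3.2 in the same KV-cache memory, which is consistent with the steep price cuts.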
