This matters because the fight over model copying is no longer confined to lobbying letters and company blog posts. Reuters reported on April 26 that the U.S. State Department told diplomats worldwide to warn foreign governments about AI models allegedly distilled from U.S. systems, naming DeepSeek and also mentioning Moonshot AI and MiniMax.
Cache-hit pricing can decide whether long-context assistants are cheap enough to ship. DeepSeek said the entire API series now charges just one-tenth of the old rate for input cache hits, while keeping a 75% off V4-Pro promotion live.
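To see why cache-hit pricing can dominate a long-context bill, here is a minimal sketch of blended input cost under a cache-hit discount. All per-token prices below are hypothetical placeholders, not DeepSeek's actual rates; only the "one-tenth of the old hit rate" structure comes from the announcement.

```python
# Illustrative only: estimate blended input cost when a fraction of input
# tokens hit the provider's context cache at a discounted rate.
# Prices are hypothetical, expressed per million tokens.

def blended_input_cost(tokens: int, hit_ratio: float,
                       miss_price: float, hit_price: float) -> float:
    """Dollar cost of `tokens` input tokens when `hit_ratio` of them
    are served from cache at `hit_price` instead of `miss_price`."""
    hits = tokens * hit_ratio
    misses = tokens - hits
    return (hits * hit_price + misses * miss_price) / 1_000_000

# Suppose misses cost $0.50/M and the old hit price of $0.10/M drops to
# one-tenth, $0.01/M (made-up numbers). A chatty assistant that re-sends
# a long shared prefix might see an 80% hit ratio on 2M input tokens:
cost = blended_input_cost(tokens=2_000_000, hit_ratio=0.8,
                          miss_price=0.50, hit_price=0.01)
print(f"${cost:.3f}")  # → $0.216
```

At an 80% hit ratio the cache discount, not the headline miss price, sets most of the bill, which is why a 10x cut to the hit rate matters for assistants that replay long conversations.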
Why it matters: model launches live or die on serving and training support, not just weights. LMSYS says its Day-0 stack reached 199 tok/s on B200 and 266 tok/s on H200, while staying strong out to 900K context.
Why it matters: open models rarely arrive with both giant context claims and deployable model splits. DeepSeek put hard numbers on the release with a 1M-context design, a 1.6T/49B Pro model, and a 284B/13B Flash variant.
HN did not latch onto DeepSeek V4 because of a polished launch page. The thread took off when commenters realized the front-page link was just updated docs while the weights and base models were already live for inspection.
LocalLLaMA upvoted this because it felt like real plumbing, not another benchmark screenshot. The excitement was about DeepSeek open-sourcing faster expert-parallel communication and reusable GPU kernels.
A remarkable 13-month comparison: running frontier-level DeepSeek R1 at ~5 tokens/second cost $6,000 in early 2025. Today, you can run a significantly stronger model at the same speed on a $600 mini PC, and get 17-20 t/s with even more capable models.
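The arithmetic behind that comparison can be made explicit. The dollar and throughput figures come from the blurb above; the "tokens/second per hardware dollar" metric is just an illustrative normalization, not a standard benchmark.

```python
# Rough cost-per-performance comparison using the figures quoted in the text.
# The normalization (throughput per dollar of hardware) is illustrative only.

def tps_per_dollar(tok_per_sec: float, hardware_cost: float) -> float:
    """Tokens/second delivered per dollar of hardware spend."""
    return tok_per_sec / hardware_cost

early_2025 = tps_per_dollar(5, 6000)    # R1-class rig, early 2025
today_same = tps_per_dollar(5, 600)     # same ~5 t/s on a $600 mini PC
today_fast = tps_per_dollar(20, 600)    # upper end of the 17-20 t/s claim

print(f"{today_same / early_2025:.0f}x")  # → 10x
print(f"{today_fast / early_2025:.0f}x")  # → 40x
```

So even taking the conservative same-speed case, hardware cost-efficiency for this class of model improved about 10x in 13 months, and up to roughly 40x at the higher quoted throughput.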
The Financial Times reports that DeepSeek V4 is set to launch next week, featuring image and video generation capabilities that position it as a direct competitor to multimodal AI models from OpenAI and Google.
A trending r/LocalLLaMA thread highlighted the DualPath paper on KV-Cache bottlenecks in disaggregated inference systems. The arXiv abstract reports up to 1.87x offline throughput and 1.96x average online throughput gains while meeting SLO.
Anthropic revealed that Chinese AI labs DeepSeek, Moonshot AI, and MiniMax created over 24,000 fraudulent accounts and generated 16+ million Claude exchanges to extract its capabilities and improve their own competing models.