Why it matters: Moonshot is turning “agent swarm” from a demo phrase into an execution claim with real scale numbers. The Kimi post says one run can coordinate 300 sub-agents across 4,000 steps and return 100-plus files instead of chat transcripts.
#kimi
HN read Kimi K2.6 as a test of whether open-weight coding agents can last through real engineering work. The 12-hour and 13-hour coding cases drew attention, while commenters immediately pressed on speed, provider accuracy, and benchmark realism.
LocalLLaMA cared about this eval post because it mixed leaderboard data with lived coding-agent pain: Opus 4.7 scored well, but the author says it felt worse in real use.
Cloudflare says Workers AI has made Kimi K2.5 3x faster for agent workloads: p90 time per token dropped from roughly 100 ms to 20-30 ms, and peak input-token cache hit ratios rose from 60% to 80% for heavy internal users.
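A quick sanity check on those figures (assuming the speedup is just the ratio of the reported p90 time-per-token numbers) shows the "3x" headline sits at the conservative end of the range:

```python
# Implied speedup from the reported p90 time-per-token numbers.
before_ms = 100.0            # p90 before, per the post
after_ms_range = (20.0, 30.0)  # p90 after

# Ratio of old to new latency at each end of the reported range.
speedups = tuple(before_ms / after for after in after_ms_range)
# (5.0, 3.33...) -> "3x" is the slow end; the fast end is closer to 5x.
```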
A Reddit thread surfaced Kimi's AttnRes paper, which argues that fixed residual accumulation in PreNorm LLMs dilutes deeper layers. The proposed attention-based residual path and its block variant aim to keep the gains without exploding memory cost.
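The contrast the paper draws can be sketched in a few lines. The standard PreNorm update is `x = x + F(LN(x))`, so every block's output lands in the residual stream with a fixed weight of 1 and early-layer content dominates the accumulated sum. The sketch below is a hypothetical illustration, not the paper's actual formulation: `attn_residual_stack`, its dot-product scoring, and the `tau` temperature are all assumptions standing in for whatever attention-based mixing AttnRes actually uses.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def fixed_residual_stack(x, sublayers):
    # Standard PreNorm: each block adds its output with fixed weight 1,
    # so the residual stream is a plain running sum over all layers.
    for f in sublayers:
        x = x + f(layer_norm(x))
    return x

def attn_residual_stack(x, sublayers, tau=1.0):
    # Hypothetical attention-weighted residual (illustrative only):
    # cache each intermediate state, score it against the newest block
    # output, and form the next state as a softmax-weighted mix, so
    # deeper layers can claim a larger share of the stream.
    history = [x]
    for f in sublayers:
        h = f(layer_norm(history[-1]))
        candidates = history + [h]
        scores = np.array([(s * h).sum() / tau for s in candidates])
        w = np.exp(scores - scores.max())
        w /= w.sum()
        mixed = sum(wi * si for wi, si in zip(w, candidates))
        history.append(mixed)
    return history[-1]
```

The memory concern the block variant targets is visible here: the attention-style path has to keep `history` (one cached tensor per layer) alive, where plain PreNorm keeps only the running sum.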