LLM

LLM 2d ago 2 min read

GitHub trims Copilot agent startup by 20% to cut dead time

The improvement sounds small until you remember where agent products lose trust: waiting. GitHub says its Copilot cloud agent now starts more than 20% faster, building on a 50% startup improvement shipped in March.

#github #copilot #agents

LLM Reddit 2d ago 2 min read

LocalLLaMA sees 38.2% as the moment local coding stops feeling theoretical

The spark in LocalLLaMA was not the raw score alone. The post landed because a 38.2% Terminal-Bench 2.0 result for Qwen3.6-27B was framed as roughly late-2025 frontier quality, putting air-gapped and privacy-heavy coding teams into a new decision zone.

#qwen #terminal-bench #local-llms

LLM Reddit 2d ago 2 min read

LocalLLaMA likes Luce DFlash because the 3090 speedup looks practical

LocalLLaMA did not treat Luce DFlash as another benchmark screenshot. The post took off because it promised almost 2x mean throughput for Qwen3.6-27B on a single RTX 3090, with no retraining and enough memory engineering to keep long-context local inference practical.

#qwen #speculative-decoding #gguf

LLM Hacker News 2d ago 2 min read

HN likes Talkie less as nostalgia and more as a clean test of what LLMs generalize

The retro hook got clicks, but Hacker News kept returning to a more serious point: a 13B model trained only on pre-1931 text makes contamination-free evaluation possible, and the thread found its simple Python wins more interesting than its antique voice.

#talkie #vintage-llm #model-evals

LLM Reddit 2d ago 2 min read

LocalLLaMA sees Hipfire as the AMD-first inference bet worth watching

LocalLLaMA immediately locked onto the thing AMD users rarely get from new tooling: hard numbers instead of vague promises. The thread heated up because Hipfire arrived with RDNA-focused benchmark tables and users were already posting their own measurements under it.

#amd #rdna #inference

LLM Hacker News 2d ago 2 min read

HN likes EvanFlow for the parts it refuses to automate

HN read EvanFlow less as another shiny agent wrapper and more as a set of brakes for agentic coding. Checkpoints, integration contracts, and explicit no-auto-commit rules drew more attention than the TDD label itself.

#claude-code #tdd #coding-agents

LLM Hacker News 2d ago 2 min read

HN thinks the SWE-bench story is about contamination, not bragging rights

HN treated OpenAI's post less as benchmark housekeeping and more as an obituary for a famous coding leaderboard. The thread cared far more about flawed tests and contamination than about who happened to top the chart first.

#openai #swe-bench #evals

LLM 2d ago 2 min read

GitHub puts Copilot on the meter, and long agent runs now change the bill

This matters because Copilot is no longer priced like a lightweight autocomplete tool. Starting June 1, 2026, GitHub will convert every Copilot plan to token-based AI Credits, end the fallback model safety net, and make code review consume GitHub Actions minutes too.

#github #copilot #pricing

LLM X/Twitter 3d ago 1 min read

Xiaomi open-sources MiMo-V2.5 with 1M context and MIT terms

This matters because Xiaomi just put a frontier-scale model family behind permissive terms instead of a closed API gate. The MiMo-V2.5 release promises a 1M-token context window, MIT licensing for commercial use and fine-tuning, and a Pro variant Xiaomi says leads open models on GDPVal-AA and ClawEval.

#xiaomi #mimo-v2.5 #open-source

LLM X/Twitter 3d ago 1 min read

Arena puts GPT-5.5 at #2 in search and +50 in Code Arena

This matters because it gives a fast third-party read on GPT-5.5 beyond launch-day marketing. Arena says GPT-5.5 landed at #2 in Search Arena, #5 in Expert Arena, and #9 in Code Arena with a 50-point gain over GPT-5.4.

#openai #gpt-5.5 #arena

LLM X/Twitter 3d ago 2 min read

OpenAI open-sources Symphony after a 500% PR jump on some teams

This matters because the next bottleneck in agent coding is human attention, not raw model speed. OpenAI says Symphony lifted landed pull requests by 500% on some teams after engineers hit a practical ceiling of roughly three to five concurrent Codex sessions.

#openai #codex #agents

LLM Reddit 3d ago 2 min read

LocalLLaMA lights up over Hipfire as AMD finally gets its own inference speed story

LocalLLaMA upvoted Hipfire because it felt like overdue attention for RDNA users, not just another repo drop. The thread filled with early tests showing multi-fold decode gains and immediate questions about quant formats and compatibility.

#amd #rdna #inference