#agentic-coding

LLM Hacker News Jun 16, 2026 2 min read

Local models are crossing from hobby setup into coding workflow

HN focused less on whether local LLMs fully replace frontier models and more on where they already make sense. The thread turned into a practical debate about Gemma, Qwen, agentic coding, memory limits, cost, and privacy.

#local-llm #agentic-coding #gemma

LLM X/Twitter May 31, 2026 1 min read

grok-build-0.1 API beta prices coding agents at $1/$2 per million tokens

xAI is turning Grok Build from a CLI-backed experience into a public API beta. The headline number is pricing: $1 per million input tokens and $2 per million output tokens for agentic coding workloads.

#xai #grok #agentic-coding

LLM X/Twitter May 15, 2026 1 min read

xAI Launches Grok Build: An Agentic CLI for Coding, Building, and Automation

xAI has released an early beta of Grok Build, an agentic CLI tool for coding, building apps, and automating workflows, available now to SuperGrok Heavy subscribers. The announcement drew over 41 million views, a sign of strong interest in the tool.

#xai #grok #cli

LLM X/Twitter Apr 29, 2026 2 min read

Poolside opens Laguna XS.2, a 33B/3B coding model for one GPU

Open-weight coding models that can run locally are still scarce. Poolside has pushed Laguna XS.2 into that lane with a 33B total / 3B active MoE that fits a single GPU, and its technical note claims 44.5% on SWE-bench Pro.

#poolside #laguna-xs.2 #open-weights

LLM Hacker News Apr 29, 2026 2 min read

HN turned a Claude managed-agent bug into a debate about token burn and trust

HN latched onto the money leak before the bug itself. A report that Claude Managed Agents append a malware reminder to every file read, then sometimes refuse to edit code anyway, turned into a broader argument about opaque token spend and whether agent harnesses deserve more scrutiny.

#claude-code #managed-agents #prompting

AI Apr 28, 2026 2 min read

GitHub shifts to 30x scale as agentic coding starts breaking ops

GitHub is no longer talking about routine uptime tuning. In its April 28 update, the company said a 10x capacity plan launched in October 2025 had to be reworked for 30x scale by February 2026, after recent incidents hit 230 repositories and 2,092 pull requests.

#github #reliability #agentic-coding

AI Hacker News Apr 28, 2026 2 min read

HN reads Copilot's pricing change as the end of cheap agentic coding

Hacker News did not focus on the headline that plan prices stay flat. The thread zeroed in on a simpler point: on April 27, 2026, GitHub admitted that long agentic coding sessions cannot be subsidized forever, and predictable Copilot costs are giving way to token math.

#github #copilot #pricing

LLM X/Twitter Apr 25, 2026 2 min read

OpenAI puts GPT-5.5 live with 82.7% on Terminal-Bench 2.0

OpenAI is pushing harder into agentic work, not just chat. On the company's own evals, GPT-5.5 reaches 82.7% on Terminal-Bench 2.0, beats GPT-5.4 by 7.6 points, and uses fewer tokens in Codex.

#openai #gpt-5-5 #codex

LLM Apr 25, 2026 2 min read

GPT-5.5 pushes agentic coding higher without adding latency

OpenAI is pitching GPT-5.5 as more than a routine model refresh. With 82.7% on Terminal-Bench 2.0, 58.6% on SWE-Bench Pro, and a claim that it keeps GPT-5.4-level latency, the company is resetting expectations for long-running coding agents.

#openai #gpt-5.5 #codex

LLM Hacker News Apr 14, 2026 2 min read

Hacker News dissects a Claude Code quota dispute where prompt caching meets 1M-context agent workflows

A large Hacker News thread turned a Claude Code quota complaint into a deeper argument about how prompt caching, background sessions, and auto-compacts behave inside 1M-context agent workflows. The GitHub issue author published April 9, 2026 usage logs, and the discussion quickly shifted from “limits feel worse” to cache accounting and quota transparency.

#claude-code #prompt-caching #agentic-coding

LLM Hacker News Apr 8, 2026 2 min read

Hacker News Sees GLM-5.1 Push Further Into Long-Horizon Agentic Engineering

Hacker News picked up Z.ai's GLM-5.1 as a model aimed less at one-shot wins and more at sustained agentic work. Z.ai reports 58.4 on SWE-Bench Pro, 42.7 on NL2Repo, 66.5 on Terminal-Bench 2.0, and long-horizon runs that keep improving through hundreds of iterations and thousands of tool calls.

#glm-5.1 #agentic-coding #swe-bench

LLM X/Twitter Apr 5, 2026 1 min read

Cursor details Composer 2's training stack in a new technical report

Cursor has published a technical report for Composer 2, outlining a two-stage recipe of continued pretraining and large-scale reinforcement learning for agentic software engineering. The company says the model reaches 61.3 on CursorBench, 61.7 on Terminal-Bench, and 73.7 on SWE-bench Multilingual while keeping pricing at $0.50/M input and $2.50/M output tokens.

#cursor #composer-2 #coding-model
