Anthropic is pushing Claude Code beyond one-off coding sessions and into persistent workflow automation. In research preview, routines can launch from 3 trigger types—schedules, API calls, and GitHub events—and are available across 4 paid plan tiers when Claude Code on the web is enabled.
LLM
Reddit lit up around a build that turns a Xiaomi 12 Pro into a headless Gemma 4 server because it feels much closer to how most people actually tinker with local AI. The excitement was not about peak numbers; it was about proving that useful local inference can live on everyday hardware.
HN reacted fast because I-DLM is not selling faster text generation someday; it is claiming diffusion-style decoding can keep pace with autoregressive quality now. The thread quickly turned into a reality check on whether the 2.9x-4.1x throughput story can survive real inference stacks.
LiteCoder is making a case that smaller coding agents still have room to climb, releasing terminal-focused models plus 11,255 trajectories and 602 Harbor environments. Its 30B model reaches 31.5% Pass@1 on Terminal Bench Pro, up from 22.0% in the preview.
Cloudflare is moving agent infrastructure out of demo mode: Sandboxes and Containers are now generally available, with 7 recent upgrades aimed at persistent coding workflows. The stack now bundles PTY terminals, credential injection, stateful interpreters, background processes, file watching, snapshots, and higher limits.
The LocalLLaMA thread took off because native speech-to-text inside llama.cpp is exactly the kind of feature that removes an extra pipeline from local agent setups. The post says llama-server can now run STT with Gemma-4 E2A and E4A models, and commenters immediately started comparing the practical experience to Whisper and Voxtral.
Hacker News liked the idea immediately, but the comments also went straight to the hard question: how useful is more autonomy if usage limits stay tight? Anthropic’s new Claude Code Routines package a prompt, repositories, and connectors into cloud-run automations that can fire on schedules, API calls, or GitHub events.
GitHub is making third-party coding agents less static: Claude and Codex users on github.com can now choose among 4 Anthropic models and 3 OpenAI models when they launch a task. That matters because model choice changes latency, spend, and code quality far more than a small UI toggle suggests.
GitHub is turning Copilot compliance from slideware into deployable policy: US and EU data residency now covers all generally available Copilot features, and US government deployments get FedRAMP Moderate infrastructure. The practical catch is cost, with data-resident requests priced at a 1.1x model multiplier.
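To make the cost catch concrete, here is a minimal sketch of the arithmetic, assuming the 1.1x multiplier scales a request's usage linearly. The function name and the example figure of 300 base units are hypothetical and are not taken from GitHub's pricing documentation.

```python
from decimal import Decimal

# Multiplier quoted in the summary above for data-resident requests.
DATA_RESIDENCY_MULTIPLIER = Decimal("1.1")


def data_resident_usage(base_units: Decimal) -> Decimal:
    """Return usage after applying the data-residency multiplier.

    Assumption: the multiplier applies linearly to whatever unit the plan
    meters (hypothetical "base units" here).
    """
    return base_units * DATA_RESIDENCY_MULTIPLIER


if __name__ == "__main__":
    # Hypothetical monthly usage of 300 base units.
    print(data_resident_usage(Decimal("300")))
```

Under that assumption, a team that would otherwise consume 300 units is billed as if it had consumed 330, a 10% overhead for keeping requests in-region.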
OpenAI is separating defensive cyber use from broad model access: verified individuals and vetted teams can now reach a cyber-permissive GPT-5.4 variant with binary reverse engineering support. The move matters because its Trusted Access for Cyber (TAC) program is expanding from a narrow program to thousands of defenders and hundreds of teams.
Anthropic is using Claude not just as a model to align, but as a researcher that improved weak-to-strong supervision nearly to the ceiling. In the linked study, nine Claude Opus 4.6 agents pushed performance-gap recovery from a 0.23 human baseline to 0.97 after 800 cumulative research hours.
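For readers unfamiliar with the metric, performance-gap recovery (PGR) is commonly defined in the weak-to-strong generalization literature as below. This is the standard definition, given here on the assumption that the linked study uses it unchanged; the symbols are illustrative names, not taken from the study.

```latex
% P_weak    : performance of the weak supervisor
% P_w2s     : performance of the strong model trained on weak supervision
% P_ceiling : performance of the strong model trained on ground-truth labels
\[
  \mathrm{PGR} \;=\; \frac{P_{\mathrm{w2s}} - P_{\mathrm{weak}}}{P_{\mathrm{ceiling}} - P_{\mathrm{weak}}}
\]
```

A PGR of 0 means the strong model does no better than its weak supervisor, and a PGR of 1 means it fully matches the ceiling, so a move from 0.23 to 0.97 corresponds to closing most of the remaining gap.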
r/MachineLearning treated this less like a finished breakthrough and more like a serious challenge to the current assumptions around large-scale spike-domain training. The April 13, 2026 post reported a 1.088B pure SNN language model reaching loss 4.4 at 27K steps with 93% sparsity, while commenters pushed for more comparable metrics and longer training before drawing big conclusions.