A merged llama.cpp PR adds MCP server selection, tool calls, prompts, resources, and an agentic loop to the WebUI stack, moving local inference closer to full agent workflows.
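The agentic loop described above can be sketched as follows. This is a minimal illustrative sketch, not llama.cpp's actual WebUI code: the model call is stubbed out, and the tool registry and message shapes are assumptions modeled on the common OpenAI-style tool-call format.

```python
import json

def get_time(_args):
    # Hypothetical example tool; in the WebUI, MCP servers would supply tools.
    return "12:00"

TOOLS = {"get_time": get_time}  # tool registry (names are illustrative)

def stub_model(messages):
    """Stand-in for a chat completion call; real code would hit the server."""
    if not any(m["role"] == "tool" for m in messages):
        # First turn: the model decides it needs a tool.
        return {"role": "assistant", "content": None,
                "tool_calls": [{"id": "1", "name": "get_time", "arguments": "{}"}]}
    # Tool result is in context: the model answers in plain text.
    return {"role": "assistant", "content": "It is 12:00."}

def agent_loop(messages, max_turns=5):
    # Core agentic loop: call the model, execute any tool calls it emits,
    # append the results, and repeat until a plain-text answer arrives.
    for _ in range(max_turns):
        reply = stub_model(messages)
        messages.append(reply)
        calls = reply.get("tool_calls")
        if not calls:
            return reply["content"]  # no tool calls: the loop is done
        for call in calls:
            result = TOOLS[call["name"]](json.loads(call["arguments"]))
            messages.append({"role": "tool", "tool_call_id": call["id"],
                             "content": result})
    raise RuntimeError("agent did not finish within max_turns")

print(agent_loop([{"role": "user", "content": "What time is it?"}]))
```

In a real client the stub would be an HTTP call to the server's chat endpoint, and `max_turns` caps runaway tool-call chains.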
LocalLLaMA users are tracking llama.cpp's merged autoparser work, which analyzes models' chat templates to support reasoning and tool-call formats with less hand-written parser code.
A high-scoring r/LocalLLaMA post details a practical move from Ollama/LM Studio-centric flows to llama-swap for multi-model operations. The key value discussed is operational control: backend flexibility, policy filters, and low-friction service management.
GitHub announced the public preview of cross-agent memory for Copilot coding agent, Copilot CLI, and Copilot code review. The system is repository-scoped, citation-verified, opt-in, and accompanied by reported improvements in evaluation and A/B test metrics.
Anthropic introduced Claude Sonnet 4.6 with a 1M token context window (beta), stronger coding and computer-use performance, and unchanged API pricing at $3/$15 per million input/output tokens.