#agents

LLM X/Twitter Mar 8, 2026 2 min read

Azure adds GPT-5.4 to Microsoft Foundry for production-grade agent workloads

Azure says GPT-5.4 is now available in Microsoft Foundry for production-grade agent workloads. Microsoft’s supporting post adds GPT-5.4 Pro, pricing, and initial deployment options, with governance controls positioned as part of the pitch.

#azure #microsoft-foundry #gpt-5.4

LLM X/Twitter Mar 8, 2026 1 min read

OpenAI updates GPT-5.4 prompting guidance for more reliable agents

OpenAI Developers has updated its GPT-5.4 API prompting guide. The new guidance focuses on tool use, structured outputs, verification loops, and long-running workflows for production-grade agents.

#openai #gpt-5.4 #prompting

LLM Mar 7, 2026 2 min read

OpenAI introduces Stateful Runtime for agents in Amazon Bedrock

OpenAI and Amazon said AWS customers will get a Stateful Runtime Environment in Amazon Bedrock for production-grade agent workflows. The announcement moves agent execution closer to managed AWS infrastructure with persistent state, governance, and long-running workflow support.

#openai #amazon-bedrock #agents

LLM Mar 6, 2026 1 min read

Microsoft Research Introduces CORPGEN for Multi-Task Enterprise Agents

Microsoft Research introduced CORPGEN on February 26, 2026 to evaluate and improve agent performance in realistic multi-task office scenarios. The framework reports up to 3.5x higher task completion than baseline systems under heavy concurrent load.

#microsoft #agents #corpgen

LLM Mar 6, 2026 2 min read

OpenAI Upgrades Operator With Slides Editing and Browser Jupyter Execution

OpenAI announced an Operator upgrade that adds Google Drive slides creation and editing, plus Jupyter-mode code execution in the browser. It also said Operator availability expanded to 20 additional regions in recent weeks, with new countries including Korea and several European markets.

#openai #operator #agents

LLM Mar 3, 2026 1 min read

Anthropic Embeds Claude Directly into Excel, PowerPoint, and Enterprise Tools

Anthropic launched Claude Cowork plugins that embed Claude natively into Microsoft Excel, PowerPoint, Slack, Gmail, and Google Drive, enabling autonomous cross-app workflows for enterprise users.

#anthropic #product-launch #agents

AI Mar 1, 2026 2 min read

Google DeepMind Introduces SIMA 2 for Generalist Agents in Virtual 3D Worlds

Google DeepMind announced SIMA 2 on November 13, 2025 as a generalist foundation model for virtual 3D environments. The system is designed to play and reason alongside humans, with in-context learning that can improve behavior from examples.

#agents #embodied-ai #deepmind

AI Hacker News Feb 27, 2026 2 min read

FDM-1 Claims a General Computer Action Model Trained on 11M Hours of Video

A high-scoring Hacker News post spotlights FDM-1, a video-native computer action model trained on an 11-million-hour dataset. The release emphasizes automatic action labeling with an inverse dynamics model (IDM) and large-scale forking-VM evaluation for long-horizon interaction tasks.

#computer-use #foundation-model #video-model

LLM Feb 26, 2026 2 min read

Google Previews Gemini Multi-Step Task Automation on Android

Google announced on February 25, 2026 that Gemini on Android will begin handling multi-step tasks in beta. The rollout starts on Pixel 10 devices and the Samsung Galaxy S26 series, initially in the U.S. and Korea.

#gemini #android #agents

AI X/Twitter Feb 22, 2026 1 min read

Karpathy: The App Store Is an Outdated Concept — The Era of Bespoke AI Software Is Here

Andrej Karpathy shared how he vibe-coded a custom health-tracking dashboard in one hour, then argued that the traditional app store model is becoming obsolete because LLM agents can generate bespoke apps on demand for individual users.

#karpathy #vibe-coding #agents

LLM Reddit Feb 16, 2026 2 min read

LocalLLaMA Spotlights MiniMax-M2.5 as Hugging Face Release Gains Traction

A high-engagement r/LocalLLaMA thread tracked the MiniMax-M2.5 release on Hugging Face. The model card emphasizes agentic coding/search benchmarks, runtime speedups, and aggressive cost positioning.

#minimax #llm #agents

Sciences Hacker News Feb 16, 2026 2 min read

Towards Autonomous Mathematics Research Hits Hacker News: Aletheia Framed as a Research Agent

A Hacker News thread highlighted arXiv 2602.10177, where DeepMind researchers introduce Aletheia, an agent workflow for mathematics research. The paper claims progress from Olympiad-style reasoning toward PhD-level tasks and semi-autonomous open-problem exploration.

#mathematics #ai-research #agents