Anthropic Study: AI Agents Are Rapidly Gaining Autonomy in Real-World Deployments
Original: Anthropic Research Reveals AI Agents Are Rapidly Gaining Autonomy in Real-World Deployments
Measuring AI Agent Autonomy in the Wild
On February 19, 2026, Anthropic published research analyzing millions of real-world interactions across Claude Code and its public API to understand the state of AI agent autonomy: how much independence people grant agents, where agents are deployed, and what risks they present.
Key Findings
Rapidly Growing Autonomy
Between October 2025 and January 2026, the 99.9th percentile session duration nearly doubled, from under 25 minutes to over 45 minutes. Researchers concluded that "existing models are capable of more autonomy than they exercise in practice," suggesting that real-world deployment is still catching up to model capability.
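For readers unfamiliar with tail percentiles, the sketch below shows why a 99.9th percentile figure describes only the very longest sessions and not typical use. It is a minimal illustration, not Anthropic's methodology: the function name `percentile_nearest_rank`, the nearest-rank method, and the session durations are all made up for the example.

```python
def percentile_nearest_rank(values, per_mille):
    """Return the percentile given in tenths of a percent (999 = 99.9th).

    Uses the nearest-rank method with integer arithmetic, so the rank
    is not affected by floating-point rounding.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < per_mille <= 1000:
        raise ValueError("per_mille must be in 1..1000")
    ordered = sorted(values)
    # 1-based rank = ceil(per_mille * n / 1000)
    rank = (per_mille * len(ordered) + 999) // 1000
    return ordered[rank - 1]


# Synthetic data (hypothetical): 1,997 short sessions and 3 long ones, in minutes.
durations = [2] * 1997 + [50] * 3

median = percentile_nearest_rank(durations, 500)  # typical session
tail = percentile_nearest_rank(durations, 999)    # longest 0.1% of sessions

print(median, tail)
```

In this toy data set the median stays at 2 minutes while the 99.9th percentile is 50 minutes, so a change in the tail figure says little about the typical session.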
Experience Changes Oversight Patterns
Novice users auto-approve roughly 20% of actions, while experienced users auto-approve around 40%. Experienced users also interrupt more frequently: they shift from action-by-action approval to monitoring-based oversight, watching full sessions and intervening only at critical moments.
Software Engineering Dominates
Software engineering accounts for nearly 50% of all agentic tool calls on the public API, with emerging but smaller applications in healthcare, finance, and customer service.
Safety Implications
Most actions (80%) involve safeguards like permission requests or human review, and only 0.8% are irreversible. Researchers recommend building robust post-deployment monitoring infrastructure as agents expand into higher-stakes domains.
Full research is available on Anthropic's research page.