OpenAI gives Agents SDK native sandboxes for long-running work
Original: The next evolution of the Agents SDK
The hard part of shipping agents is no longer just the model call. Production agents need to read files, run commands, edit code, preserve state, and keep working across many steps without turning every team into an infrastructure team. OpenAI’s April 15 Agents SDK update moves more of that execution loop into the SDK itself.
The release centers on two pieces: a model-native harness and native sandbox execution. The harness is built for agents that operate across files, documents, and systems, with configurable memory, sandbox-aware orchestration, and Codex-like filesystem tools. OpenAI is also standardizing the primitives that now show up in serious agent systems: MCP tool use, skills, AGENTS.md instructions, shell execution, apply patch edits, and related integrations.
The sandbox layer is the more practical product change. Agents can now run inside controlled computer environments with the files, tools, and dependencies they need for a task. Developers can bring their own sandbox or use supported providers including Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel. A new Manifest abstraction describes the workspace, mounts local files, defines output directories, and connects storage from AWS S3, Google Cloud Storage, Azure Blob Storage, and Cloudflare R2.
That design also answers a security problem that agent teams keep running into. OpenAI says agent systems should assume prompt injection and exfiltration attempts. Separating the harness from compute helps keep credentials out of the environment where model-generated code runs. Snapshotting and rehydration also make runs more durable: if a sandbox container expires or fails, the SDK can restore state in a fresh container and continue from the last checkpoint.
The new capabilities are generally available to API customers and use standard API pricing based on tokens and tool use. They launch first in Python, with TypeScript support planned later, along with code mode and subagents. The larger signal is that OpenAI is treating agents as a runtime problem, not a prompt pattern. Teams still need to design their data boundaries and workspace policies carefully, but less of the basic orchestration has to be rebuilt for every product.