OpenAI Agents SDK uses sandboxes to make long-running agents easier to ship as products

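The model is not the only bottleneck in turning agents into products. In a real service, an agent has to read files, run commands, edit code, and work for long stretches without losing intermediate state. In the Agents SDK update released on April 15, OpenAI made this execution loop a built-in capability of the SDK.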

The core of the new SDK is a model-native harness and native sandbox execution. For agents that work with files, documents, and systems, the harness provides configurable memory, sandbox-aware orchestration, and Codex-like filesystem tools. By bundling agentic primitives such as MCP, skills, AGENTS.md, the shell tool, and the apply patch tool as standard components, OpenAI has made clear that it wants developers to spend less time building their own framework each time.

Sandbox support is the more practical change. An agent can work inside a controlled computer environment that has the files, tools, and dependencies it needs. Developers can bring their own sandbox or use the built-in support for Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel. The Manifest abstraction unifies how a workspace is described: mounting local files, defining output directories, and bringing data from AWS S3, Google Cloud Storage, Azure Blob Storage, and Cloudflare R2 into the workspace.
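To make the manifest idea concrete, here is a minimal sketch of what a declarative workspace description could look like. It is not the Agents SDK API: `Mount`, `WorkspaceManifest`, the URI scheme labels, and the validation rules are all hypothetical names and assumptions chosen for illustration, using only the Python standard library.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath

# Hypothetical scheme labels for illustration only. The real SDK may name
# or configure storage backends differently.
SUPPORTED_SCHEMES = {"file", "s3", "gs", "az", "r2"}


@dataclass(frozen=True)
class Mount:
    """One data source made visible inside the sandbox (hypothetical type)."""
    source: str             # e.g. "file:///srv/repo" or "s3://bucket/prefix"
    target: str             # absolute POSIX path inside the sandbox workspace
    read_only: bool = True  # default to the least privilege


@dataclass
class WorkspaceManifest:
    """Declarative description of a sandbox workspace (hypothetical type)."""
    mounts: list[Mount] = field(default_factory=list)
    output_dir: str = "/workspace/out"

    def validate(self) -> None:
        """Raise ValueError if the manifest is malformed."""
        seen_targets = set()
        for mount in self.mounts:
            scheme, sep, _rest = mount.source.partition("://")
            if not sep or scheme not in SUPPORTED_SCHEMES:
                raise ValueError(f"unsupported source: {mount.source!r}")
            target = PurePosixPath(mount.target)
            if not target.is_absolute() or ".." in target.parts:
                raise ValueError(f"unsafe target path: {mount.target!r}")
            if mount.target in seen_targets:
                raise ValueError(f"duplicate target: {mount.target!r}")
            seen_targets.add(mount.target)
        if not PurePosixPath(self.output_dir).is_absolute():
            raise ValueError(f"output_dir must be absolute: {self.output_dir!r}")


manifest = WorkspaceManifest(
    mounts=[
        Mount("file:///srv/repo", "/workspace/repo", read_only=False),
        Mount("s3://example-bucket/datasets", "/workspace/data"),
    ],
)
manifest.validate()  # raises ValueError on a malformed manifest
```

The point of the sketch is the design choice rather than the names: when local files and object storage are declared in one structure, the workspace policy (what is mounted, where, and whether it is writable) can be reviewed and validated before any model-generated code runs.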

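The security design also matters from a product standpoint. OpenAI says agent systems should be designed on the assumption that prompt-injection and exfiltration attempts will happen. Separating the harness from compute means credentials do not have to be placed directly in the environment where model-generated code runs. Snapshotting and rehydration let a new container pick up the agent's state even if the sandbox container disappears, which lowers the cost of failure for long-running tasks.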
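The snapshot-and-rehydrate pattern can be illustrated with a small standard-library sketch. This is an assumption-laden toy, not how the SDK implements it: `snapshot` and `rehydrate` are hypothetical helpers, temporary directories stand in for sandbox containers, and only UTF-8 text files are handled.

```python
import json
import tempfile
from pathlib import Path


def snapshot(workspace: Path, agent_state: dict) -> str:
    """Serialize the workspace's text files and the agent state to JSON.

    Hypothetical helper: a real system would also need to handle binary
    files, permissions, and large artifacts.
    """
    files = {
        path.relative_to(workspace).as_posix(): path.read_text(encoding="utf-8")
        for path in sorted(workspace.rglob("*"))
        if path.is_file()
    }
    return json.dumps({"files": files, "state": agent_state})


def rehydrate(blob: str, new_workspace: Path) -> dict:
    """Recreate the files in a fresh workspace and return the agent state."""
    data = json.loads(blob)
    for relative_path, content in data["files"].items():
        destination = new_workspace / relative_path
        destination.parent.mkdir(parents=True, exist_ok=True)
        destination.write_text(content, encoding="utf-8")
    return data["state"]


# Simulate losing one "container" and resuming in another.
with tempfile.TemporaryDirectory() as old_dir:
    old_workspace = Path(old_dir)
    (old_workspace / "src").mkdir()
    (old_workspace / "src" / "notes.txt").write_text("step 3 done", encoding="utf-8")
    blob = snapshot(old_workspace, {"next_step": 4})
# The old directory is gone here; only the snapshot blob survives.

with tempfile.TemporaryDirectory() as new_dir:
    new_workspace = Path(new_dir)
    state = rehydrate(blob, new_workspace)
    restored = (new_workspace / "src" / "notes.txt").read_text(encoding="utf-8")
```

Because the snapshot lives outside the sandbox, it fits the harness/compute split described above: the harness can keep the blob and any credentials on its side and hand a fresh container only the files it needs to continue.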

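The new capabilities are generally available (GA) to API customers and are billed under standard API pricing, based on tokens and tool use. The launch focuses on Python, with TypeScript support planned for later. The important point is that OpenAI has started to treat agents not as simple model calls but as a runtime problem that includes the file system, the sandbox, and tool orchestration. Developers can build production agents with less custom infrastructure, but in exchange they need to design workspace policy and data boundaries explicitly.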
