OpenAI launches GPT-5.4 mini and nano for coding, tool use, and subagent workloads
Original: Introducing GPT-5.4 mini and nano
OpenAI on March 17 introduced GPT-5.4 mini and GPT-5.4 nano as smaller GPT-5.4 variants built for low-latency workloads. The company said the release targets product surfaces where response speed affects user experience directly, including coding assistants, delegated subagents, computer-use systems that interpret screenshots, and real-time multimodal applications.
In OpenAI's positioning, GPT-5.4 mini is the primary small model. It is described as more than 2x faster than GPT-5 mini while improving on coding, reasoning, multimodal understanding, and tool use. OpenAI also said GPT-5.4 mini approaches the larger GPT-5.4 model on several evaluations: the post lists GPT-5.4 mini at 54.4% on SWE-Bench Pro, 72.1% on OSWorld-Verified, and 42.9% on Toolathlon. GPT-5.4 nano, meanwhile, is framed as the smallest and cheapest option for classification, data extraction, ranking, and simpler coding subagents.
The release is notable because OpenAI is explicitly designing for mixed-model agent systems rather than only for single-model chat. In the Codex example from the post, a larger model handles planning and final judgment while GPT-5.4 mini subagents take narrower tasks such as codebase search, large-file review, or document processing in parallel. That makes the announcement as much about workflow design as raw benchmark gains.
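The planner-plus-subagents pattern described above can be sketched in a few lines. This is a minimal illustration of the workflow shape only, not OpenAI's implementation: the two functions are hypothetical stubs standing in for calls to a larger planning model and to parallel GPT-5.4 mini subagents.

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(task: str) -> str:
    """Stub for a narrow task delegated to a small model (e.g. GPT-5.4 mini).
    A real system would issue an API call with the small model here."""
    return f"[mini] result for: {task}"

def plan_and_judge(subresults: list[str]) -> str:
    """Stub for the larger model, which combines subagent output
    into a final judgment."""
    return f"final answer based on {len(subresults)} subagent results"

# Fan narrow tasks out to parallel subagents, then let the planner judge.
tasks = [
    "search codebase for auth handlers",
    "review large config file",
    "summarize design doc",
]
with ThreadPoolExecutor(max_workers=3) as pool:
    subresults = list(pool.map(run_subagent, tasks))

print(plan_and_judge(subresults))
```

The design point the announcement makes is visible even in the stub: the expensive model is called once at the end, while the cheap, fast model absorbs the parallel, high-volume work.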
What ships now
- GPT-5.4 mini is available in the API, Codex, and ChatGPT, with a 400k context window.
- The model supports text and image inputs, tool use, function calling, web search, file search, computer use, and skills.
- OpenAI lists pricing at $0.75 per 1M input tokens and $4.50 per 1M output tokens for GPT-5.4 mini, while GPT-5.4 nano is API-only at $0.20 and $1.25.
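The listed prices translate into per-request cost with simple arithmetic. A minimal sketch, using only the per-1M-token figures quoted above (the `cost_usd` helper is illustrative, not an SDK function):

```python
# Listed prices in USD per 1M tokens, as quoted in the announcement.
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from the listed per-1M-token prices."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a subagent call with 20k input tokens and 2k output tokens.
print(round(cost_usd("gpt-5.4-mini", 20_000, 2_000), 4))  # 0.024
print(round(cost_usd("gpt-5.4-nano", 20_000, 2_000), 4))  # 0.0065
```

At these rates a mini subagent call of that size costs roughly 2.4 cents, which is the kind of arithmetic that determines whether fanning out many parallel subagents is viable at scale.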
The practical takeaway is that OpenAI is trying to make small models good enough for production coding and agent orchestration, not just for fallback chat. The company is positioning GPT-5.4 mini as a high-volume execution tier inside larger agent systems, where latency, quota usage, and cost determine whether a workflow is usable at scale.
Related Articles
OpenAI said on X that GPT-5.4 mini is rolling out in ChatGPT, Codex, and the API, while GPT-5.4 nano is aimed at lower-cost API workloads. The company is positioning the pair as faster small models for coding, multimodal tasks, and agent sub-workflows.
OpenAI Developers said on X that GPT-5.4 mini and nano are now part of the GPT-5.4 family for developer workflows. OpenAI positions mini as a faster coding and tool-use model for API, Codex, and ChatGPT, while nano is the lowest-cost option for lighter API workloads.
OpenAI said on March 5, 2026 that GPT-5.4 Thinking shows low Chain-of-Thought controllability, which for now strengthens CoT monitoring as a safety signal. The release pairs an X post with a new open-source evaluation suite and research paper.