OpenAI launches GPT-5.4 mini and nano for coding, tool use, and subagent workloads

Original: Introducing GPT-5.4 mini and nano

LLM · Mar 22, 2026 · By Insights AI

OpenAI on March 17 introduced GPT-5.4 mini and GPT-5.4 nano as smaller GPT-5.4 variants built for low-latency workloads. The company said the release targets product surfaces where response speed affects user experience directly, including coding assistants, delegated subagents, computer-use systems that interpret screenshots, and real-time multimodal applications.

In OpenAI's positioning, GPT-5.4 mini is the primary small model. It is described as more than 2x faster than GPT-5 mini while improving on coding, reasoning, multimodal understanding, and tool use. OpenAI also said GPT-5.4 mini approaches the larger GPT-5.4 model on several evaluations. The post lists 54.4% on SWE-Bench Pro, 72.1% on OSWorld-Verified, and 42.9% on Toolathlon. GPT-5.4 nano, meanwhile, is framed as the smallest and cheapest option for classification, data extraction, ranking, and simpler coding subagents.

The release is notable because OpenAI is explicitly designing for mixed-model agent systems rather than only for single-model chat. In the Codex example from the post, a larger model handles planning and final judgment while GPT-5.4 mini subagents take narrower tasks such as codebase search, large-file review, or document processing in parallel. That makes the announcement as much about workflow design as raw benchmark gains.
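The delegation pattern described above can be sketched in a few lines. This is an illustrative sketch, not OpenAI's implementation: `plan_tasks` and `call_subagent` are hypothetical stand-ins (stubbed locally here) for a call to a larger planning model and to GPT-5.4 mini subagents respectively.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a GPT-5.4 mini subagent call.
# A real system would send each narrow, self-contained task to the API.
def call_subagent(task: str) -> str:
    return f"result for: {task}"

# Hypothetical planner step: a larger model decomposes the request
# into narrow subtasks (codebase search, large-file review, etc.).
def plan_tasks(request: str) -> list[str]:
    return [
        f"search codebase for symbols relevant to '{request}'",
        f"review large files touched by '{request}'",
        f"summarize docs related to '{request}'",
    ]

def run_agent(request: str) -> list[str]:
    tasks = plan_tasks(request)
    # Fan subtasks out to fast, cheap subagents in parallel; the larger
    # model would then review and merge the results into a final answer.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(call_subagent, tasks))

results = run_agent("fix flaky login test")
```

The design point is that the expensive model spends tokens only on planning and final judgment, while parallel subagents absorb the high-volume, latency-sensitive work.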

What ships now

  • GPT-5.4 mini is available in the API, Codex, and ChatGPT, with a 400K-token context window.
  • The model supports text and image inputs, tool use, function calling, web search, file search, computer use, and skills.
  • OpenAI lists pricing for GPT-5.4 mini at $0.75 per 1M input tokens and $4.50 per 1M output tokens, while GPT-5.4 nano is API-only at $0.20 per 1M input tokens and $1.25 per 1M output tokens.
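At the listed rates, per-request cost is simple arithmetic. A minimal sketch using the prices quoted above (the model keys and token counts are made-up examples, not official identifiers):

```python
# Per-1M-token prices (USD), as listed in the announcement.
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-1M-token rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a 10k-token prompt with a 2k-token reply.
mini_cost = request_cost("gpt-5.4-mini", 10_000, 2_000)  # $0.0165
nano_cost = request_cost("gpt-5.4-nano", 10_000, 2_000)  # $0.0045
```

For high-volume subagent traffic, this kind of back-of-the-envelope math is what decides which tier a given subtask lands on.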

The practical takeaway is that OpenAI is trying to make small models good enough for production coding and agent orchestration, not just for fallback chat. The company is positioning GPT-5.4 mini as a high-volume execution tier inside larger agent systems, where latency, quota usage, and cost determine whether a workflow is usable at scale.


Related Articles

LLM · sources.twitter · 5d ago

OpenAI said on X that GPT-5.4 mini is rolling out in ChatGPT, Codex, and the API, while GPT-5.4 nano is aimed at lower-cost API workloads. The company is positioning the pair as faster small models for coding, multimodal tasks, and agent sub-workflows.

LLM · sources.twitter · 4d ago

OpenAI Developers said on X that GPT-5.4 mini and nano are now part of the GPT-5.4 family for developer workflows. OpenAI positions mini as a faster coding and tool-use model for API, Codex, and ChatGPT, while nano is the lowest-cost option for lighter API workloads.


© 2026 Insights. All rights reserved.