Google DeepMind Rolls Out Gemini 3.1 Flash-Lite Preview
Original: Gemini 3.1 Flash-Lite has landed.
On March 3, 2026, Google DeepMind said on X that Gemini 3.1 Flash-Lite is rolling out in preview and is available through the Gemini API in Google AI Studio. In the launch thread, Google described Flash-Lite as the most cost-efficient model in the Gemini 3 series and said it is built for intelligence at scale rather than maximum flagship complexity.
Google DeepMind also used the thread to compare the new model with the previous tier. According to the company, Gemini 3.1 Flash-Lite outperforms Gemini 2.5 Flash while delivering faster performance at a lower price. Google added that new thinking levels let developers tune how much reasoning the model uses for different workloads, which means teams can balance cost, latency, and reasoning depth more directly inside production systems.
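As a rough illustration of tuning reasoning per workload, the sketch below routes task types to a thinking level before a request is sent. The level names ("low", "medium", "high"), the task categories, the model identifier, and the request dictionary shape are all assumptions made for illustration; the launch thread does not spell them out, and this is not the Gemini API's actual request format.

```python
from enum import Enum


class ThinkingLevel(str, Enum):
    # Assumed level names; the launch thread does not list them.
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical mapping from workload type to reasoning budget.
ROUTES = {
    "classification": ThinkingLevel.LOW,
    "summarization": ThinkingLevel.MEDIUM,
    "ui_generation": ThinkingLevel.HIGH,
}

# Assumed model identifier, used only as a placeholder.
DEFAULT_MODEL = "gemini-3.1-flash-lite-preview"


def build_request(task: str, prompt: str, model: str = DEFAULT_MODEL) -> dict:
    """Return an illustrative request description for the given task.

    Unknown task types fall back to the cheapest level so that
    unclassified traffic does not silently consume extra reasoning.
    """
    level = ROUTES.get(task, ThinkingLevel.LOW)
    return {"model": model, "prompt": prompt, "thinking_level": level.value}


if __name__ == "__main__":
    print(build_request("ui_generation", "Build a sales dashboard layout"))
    print(build_request("unknown_task", "Tag this support ticket"))
```

Keeping the routing table separate from the request builder means a team could adjust the cost and latency trade-off for one workload without touching the rest of the pipeline.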
The company said Flash-Lite can still handle more complex tasks than a minimal low-cost tier might suggest. Google specifically pointed to workloads such as generating UI, building dashboards, and creating simulations. That combination of lower price, faster speed, and adjustable reasoning makes the model relevant for developers who need high request volume and predictable operating costs without dropping down to a bare-bones utility model.
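To make the operating-cost point concrete, here is a minimal estimate using the preview prices reported for the launch: $0.25 per 1M input tokens and $1.50 per 1M output tokens. The workload figures are hypothetical, and the sketch assumes billing depends only on input and output token counts, which may not match the final pricing rules.

```python
from dataclasses import dataclass

# Preview prices in USD per 1M tokens, as reported at launch; subject to change.
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 1.50


@dataclass(frozen=True)
class Workload:
    # Hypothetical traffic profile for one day.
    requests_per_day: int
    input_tokens_per_request: int
    output_tokens_per_request: int


def daily_cost_usd(w: Workload) -> float:
    """Estimate daily spend from token volume alone (an assumed billing model)."""
    input_tokens = w.requests_per_day * w.input_tokens_per_request
    output_tokens = w.requests_per_day * w.output_tokens_per_request
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # One million requests a day, 500 input and 100 output tokens each.
    workload = Workload(1_000_000, 500, 100)
    print(f"Estimated daily cost: ${daily_cost_usd(workload):.2f}")
```

At these prices, output tokens cost six times as much as input tokens, so in this example the 100 output tokens per request account for more of the bill than the 500 input tokens.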
Google framed the launch as a practical deployment option rather than a frontier showcase. With preview access already live through the Gemini API in Google AI Studio, Flash-Lite gives teams another way to segment workloads by cost and reasoning budget within the Gemini lineup. The primary source for the launch is Google DeepMind's X thread.
Related Articles
Google introduced Project Spend Caps, revamped Usage Tiers, and new billing dashboards for Gemini API developers in AI Studio. The update is aimed at making cost control and scaling behavior more predictable for teams moving into paid usage.
Google AI shared practical Gemini 3.1 Flash-Lite examples, including high-volume image sorting and business automation scenarios. The thread also points developers to preview access via Gemini API, Google AI Studio, and Vertex AI.
Google introduced Gemini 3.1 Flash-Lite on March 3, 2026 as its fastest and most cost-efficient Gemini 3 series model. The model is rolling out in preview through the Gemini API in Google AI Studio and Vertex AI, with pricing of $0.25/1M input tokens and $1.50/1M output tokens, plus claims of a 2.5x faster Time to First Answer Token and 45% higher output speed than 2.5 Flash.