Google DeepMind Rolls Out Gemini 3.1 Flash-Lite Preview
On March 3, 2026, Google DeepMind said on X that Gemini 3.1 Flash-Lite is rolling out in preview and is available through the Gemini API in Google AI Studio. In the launch thread, Google described Flash-Lite as the most cost-efficient model in the Gemini 3 series and said it is built for intelligence at scale rather than maximum flagship complexity.
Google DeepMind also used the thread to compare the new model with the previous tier. According to the company, Gemini 3.1 Flash-Lite outperforms Gemini 2.5 Flash while running faster and at a lower price. Google added that new thinking levels let developers tune how much reasoning the model applies to a given workload, so teams can balance cost, latency, and reasoning depth directly inside production systems.
The company said Flash-Lite can still handle more complex tasks than a minimal low-cost tier might suggest. Google specifically pointed to workloads such as generating UI, building dashboards, and creating simulations. That combination of lower price, faster speed, and adjustable reasoning makes the model relevant for developers who need high request volume and predictable operating costs without dropping down to a bare-bones utility model.
Google framed the launch as a practical deployment option rather than a frontier showcase. With preview access already live through the Gemini API and Google AI Studio, Flash-Lite gives teams another way to segment workloads by cost and reasoning budget inside the Gemini lineup. The primary source for the launch is Google DeepMind's X thread.
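For teams weighing that cost-versus-volume trade-off, the arithmetic is straightforward at the preview pricing cited in coverage of the launch ($0.25 per 1M input tokens, $1.50 per 1M output tokens). The sketch below is illustrative only; the request counts and per-request token averages are hypothetical, and actual pricing should be confirmed against Google's published rates.

```python
# Rough cost estimator for a high-volume workload at the preview pricing
# reported for Gemini 3.1 Flash-Lite. Prices here are the figures cited
# at launch; the workload numbers below are made up for illustration.

INPUT_PRICE_PER_M = 0.25   # USD per 1M input tokens (reported preview price)
OUTPUT_PRICE_PER_M = 1.50  # USD per 1M output tokens (reported preview price)

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for a batch of requests at the quoted rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Hypothetical batch: 100,000 requests averaging 800 input / 200 output tokens.
total_input = 100_000 * 800    # 80M input tokens
total_output = 100_000 * 200   # 20M output tokens
print(f"${estimate_cost_usd(total_input, total_output):.2f}")  # → $50.00
```

At those rates, a day of 100,000 mid-sized requests lands around fifty dollars, which is the kind of back-of-envelope calculation the "intelligence at scale" positioning invites.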
Related Articles
Google AI shared practical Gemini 3.1 Flash-Lite examples, including high-volume image sorting and business automation scenarios. The thread also points developers to preview access via Gemini API, Google AI Studio, and Vertex AI.
Google DeepMind said on March 3, 2026 that Gemini 3.1 Flash-Lite delivers faster performance at a lower price than Gemini 2.5 Flash. Google is rolling the model out in preview via Google AI Studio and Vertex AI for high-volume, latency-sensitive workloads.
Google on March 3, 2026 introduced Gemini 3.1 Flash-Lite as the fastest and most cost-efficient model in the Gemini 3 family. The preview is rolling out through Google AI Studio and Vertex AI at $0.25/1M input tokens and $1.50/1M output tokens.