Google DeepMind Releases Gemini 3.1 Flash-Lite in Preview

On March 3, 2026, Google DeepMind announced on X that Gemini 3.1 Flash-Lite is rolling out in preview and is available through the Gemini API and Google AI Studio. In the launch thread, Google introduced Flash-Lite as the most cost-efficient model in the Gemini 3 series and described it as built for intelligence at scale rather than as a flagship tier meant to showcase peak performance.

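Google DeepMind also compared the new model with the previous tier. According to the company, Gemini 3.1 Flash-Lite delivers better results than Gemini 2.5 Flash at a lower price and with faster performance. The company also said that new thinking levels let developers adjust the amount of reasoning per workload, so teams can tune cost, latency, and reasoning depth more directly inside production systems.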
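As a rough illustration of per-workload reasoning control, the sketch below routes each workload to a thinking level before a request is built. It makes no SDK calls. The level labels ("low", "medium", "high"), the model identifier string, the `thinking_level` field name, and the routing rules are all assumptions made for illustration; the announcement does not specify them.

```python
from dataclasses import dataclass

# Hypothetical identifier and labels: the announcement only says that
# "thinking levels" exist, not what they are called or how they are set.
MODEL_ID = "gemini-3.1-flash-lite-preview"
THINKING_LEVELS = ("low", "medium", "high")


@dataclass(frozen=True)
class Workload:
    name: str
    needs_multistep_reasoning: bool
    latency_sensitive: bool


def pick_thinking_level(workload: Workload) -> str:
    """Map a workload to an assumed thinking level label."""
    if workload.needs_multistep_reasoning and not workload.latency_sensitive:
        return "high"    # spend more reasoning when latency is not critical
    if workload.needs_multistep_reasoning:
        return "medium"  # compromise between depth and response time
    return "low"         # cheap, fast path for simple high-volume requests


def build_request(workload: Workload, prompt: str) -> dict:
    """Build a plain request description; field names are illustrative only."""
    return {
        "model": MODEL_ID,
        "thinking_level": pick_thinking_level(workload),
        "prompt": prompt,
    }


if __name__ == "__main__":
    tagging = Workload("tagging", needs_multistep_reasoning=False, latency_sensitive=True)
    dashboard = Workload("dashboard", needs_multistep_reasoning=True, latency_sensitive=False)
    print(build_request(tagging, "Label this support ticket."))
    print(build_request(dashboard, "Draft a sales dashboard layout."))
```

Keeping the routing decision in one small function means a team could revise the mapping as it measures actual cost and latency, without touching the code that sends requests.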

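The company stressed that Flash-Lite can handle more complex tasks than a simple ultra-low-cost model. Its examples included generating UIs, building dashboards, and creating simulations. Because it combines low price, high speed, and adjustable reasoning, it looks like a meaningful option for developers who need high request volumes and predictable operating costs.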

Google framed the launch as a practical deployment option rather than a frontier showcase. With preview access already open in the Gemini API and Google AI Studio, Flash-Lite gives teams a new choice for splitting workloads within the Gemini lineup more finely by cost and reasoning budget. The primary source is Google DeepMind's X thread.
