
Gemini 3.5 Flash adds native computer use for cross-interface agents


LLM · Jun 26, 2026 · By Insights AI (Twitter) · 2 min read

Gemini moves from chat to screen action

Google DeepMind is putting computer-use behavior directly into Gemini 3.5 Flash, giving developers a faster default path for agents that act on real interfaces. The Google DeepMind account posted the update at 2026-06-25 16:21:10 UTC, and FxTwitter showed roughly 69,000 views and more than 760 likes at the time of collection. The technical stake is not another chatbot score. It is whether a mainstream Gemini model can see a screen, choose actions, and operate across browser, mobile, and desktop workflows without a separate specialized setup.

“Gemini 3.5 Flash now supports native computer use.”

The Google DeepMind account is a first-party channel for Gemini and research updates, so this tweet is useful as a primary signal. The linked Google blog describes computer use as a built-in tool for Gemini 3.5 Flash, aimed at custom agents that can receive a goal and work through the interface steps. That puts the feature near the center of the developer stack rather than treating it as a side demo. It matters for QA automation, back-office operations, customer-support tooling, data-entry workflows, and internal software that still depends on visual interfaces.

The gap between RPA and agents

Traditional RPA can repeat known flows, but it breaks when labels, layouts, or intermediate states change. LLM agents can reason about goals, but real UI control still exposes brittle planning, hallucinated clicks, and missing guardrails. Native computer use in Gemini 3.5 Flash is Google’s attempt to narrow that gap: the model can interpret visual state, decide the next step, and call actions inside the same agent loop developers already use.
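To make the loop concrete, here is a minimal Python sketch of the observe, decide, act cycle described above. It does not use the Gemini API: `FakeScreen`, `stub_policy`, `run_agent`, and `SENSITIVE_TARGETS` are hypothetical names invented for illustration, the "screen" is a toy login form rather than a screenshot, and the policy is a hard-coded stand-in for the model call. The confirmation gate is one assumed way to handle sensitive actions, not a description of Google's controls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # name of the UI element the action applies to
    text: str = ""     # text to enter, for "type" actions


# Hypothetical list of UI targets that should never be touched without approval.
SENSITIVE_TARGETS = {"pay_button", "delete_file"}


class FakeScreen:
    """Toy stand-in for a real interface: two text fields and a submit button."""

    def __init__(self) -> None:
        self.fields = {"username": "", "password": ""}
        self.logged_in = False

    def observe(self) -> dict:
        # A real agent would capture a screenshot here; the sketch returns a dict.
        return {"fields": dict(self.fields), "logged_in": self.logged_in}

    def apply(self, action: Action) -> None:
        if action.kind == "type" and action.target in self.fields:
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.logged_in = all(self.fields.values())


def stub_policy(goal: str, state: dict) -> Action:
    """Placeholder for the model: picks the next step from the observed state."""
    if state["logged_in"]:
        return Action("done")
    for name, value in state["fields"].items():
        if not value:
            return Action("type", name, "demo")
    return Action("click", "submit")


def deny_all(action: Action) -> bool:
    """Default confirmation hook: refuse every sensitive action."""
    return False


def run_agent(
    goal: str,
    screen: FakeScreen,
    policy: Callable[[str, dict], Action],
    max_steps: int = 10,
    confirm: Callable[[Action], bool] = deny_all,
) -> Tuple[bool, List[Action]]:
    """Observe, decide, act until the policy reports done or the budget runs out."""
    history: List[Action] = []
    for _ in range(max_steps):
        state = screen.observe()
        action = policy(goal, state)
        if action.kind == "done":
            return True, history
        if action.target in SENSITIVE_TARGETS and not confirm(action):
            return False, history  # stop rather than perform an unapproved action
        screen.apply(action)
        history.append(action)
    return False, history  # step budget exhausted


if __name__ == "__main__":
    ok, steps = run_agent("log in", FakeScreen(), stub_policy)
    print(ok, [(a.kind, a.target) for a in steps])
```

The step budget and the confirmation hook are the two places where a production loop would most likely differ: a real deployment would need retries after wrong clicks and a human or policy check behind `confirm`, neither of which this sketch attempts.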

What to watch next is reliability outside the demo. Developers will need numbers on task completion, recovery after wrong clicks, latency, and restrictions around sensitive actions such as payments, account changes, file deletion, or private data. If the safety controls hold and the failure rate is low enough, computer use could become a normal tool call rather than a separate category of automation product.

Source: Google DeepMind source tweet · Google blog
