Gemini 3.5 Flash gets computer use, and HN focuses on trust boundaries
Original: Computer use in Gemini 3.5 Flash View original →
Google has added a built-in computer use tool to Gemini 3.5 Flash, pushing the model further into browser and screen-driven workflows. Google’s post frames the feature around a model that can inspect interfaces, click, type, and continue through web tasks.
The HN discussion did not settle on a simple benchmark story. The sharper question was delegation: what kinds of work should a user allow an agent to perform on a real account? Browser control can reduce tedious research, booking, and form-filling, but the same capability makes mistaken clicks, confused context, and data exposure more consequential.
Several comments pulled the discussion toward reliability. One user described a table-extraction task that still failed after many correction rounds. Another raised a coding-agent case where a model used a dangerous repository command while trying to prepare a commit. Those anecdotes do not disprove the feature, but they explain why raw capability is not enough for computer use.
The practical bar is a product-design bar as much as a model bar. Computer use needs interruption points, confirmation for irreversible actions, narrower permissions, and clean integration with external tools. Some users also pointed to missing MCP-style connectivity as a blocker for richer personal workflows.
Gemini is clearly moving deeper into agentic workflows. The community reaction suggests the next differentiator will not be whether a model can click through a page, but whether it knows when to stop, ask, and hand control back.
Related Articles
Agent competition is moving from answer quality to controlled action on screens. Google DeepMind says Gemini 3.5 Flash now has a built-in computer-use tool for browser, mobile, and desktop interfaces.
Google’s I/O 2026 AI story is about distribution as much as models. Gemini 3.5 Flash is now generally available across API, Antigravity, Android Studio, enterprise tools, Search, and the Gemini app, while Gemini Omni Flash brings video generation into the same push.
ServiceNow’s MosaicLeaks benchmark targets a quiet failure mode in deep research agents: private facts leaking through external queries. Training only for task success raised leakage from 34.0% to 51.7%, while PA-DR cut it to 9.9%.