Google turns Cloud Next into an agent-platform pitch at 16B TPM
Original: 7 highlights from Google Cloud Next ‘26 View original →
Google used Cloud Next ’26 to make a blunt argument: the AI market is moving beyond model demos and into agent operations. In its Apr. 24 recap, Google said 75% of Cloud customers now use its AI products, 330 customers processed more than one trillion tokens each over the past 12 months, and direct API traffic across its first-party models has climbed to more than 16 billion tokens per minute. Those numbers are less about bragging rights than about positioning. Google wants customers to see it as the place where agents are built, governed and run, not merely the company that ships frontier models.
The centerpiece is the new Gemini Enterprise Agent Platform. Google describes it as an end-to-end workspace that bundles model building and tuning with agent integration, security, and DevOps. The company is also trying to lower the skill floor. Agent Studio, the low-code interface inside the platform, is meant to let developers and business teams build and test agents with natural language instead of stitching together infrastructure by hand. Google says the platform gives access to Gemini 3.1 Pro for complex workflows, Gemini 3.1 Flash Image for image generation, Lyria 3 for audio, and even Anthropic’s Claude Opus 4.7. That last detail matters: open choice is turning into a competitive feature, not a concession.
The scale claims add context. A year ago, cloud AI stories were mostly about pilots. Google is now saying the agentic enterprise is already here, and the stack must include more than models. It must also include data, tooling, security, orchestration, monitoring and distribution inside the company. That is why the Cloud Next recap groups TPUs, security agents, productivity tools and developer platform features into one narrative. Google is not selling a single feature. It is selling a control plane for AI work.
That framing matters because the frontier is shifting. Model quality still decides the shortlist, but it no longer closes the deal on its own. Enterprises want to know how agents connect to internal systems, how they are governed, how fast they can be deployed and which models can be swapped in when cost, latency or compliance changes. Google’s recap reads like an answer to that demand. The headline numbers, especially 16 billion tokens per minute and 330 ultra-heavy customers, show how hard the company is pushing scale. The more important takeaway is strategic: cloud AI is turning into runtime, governance and workflow infrastructure. The source recap is here.
Related Articles
Google unveiled Gemini 3.5 Flash at I/O 2026, combining frontier performance with agentic capabilities. The new Managed Agents API provisions a full sandboxed agent environment with a single API call.
Google’s I/O 2026 AI story is about distribution as much as models. Gemini 3.5 Flash is now generally available across API, Antigravity, Android Studio, enterprise tools, Search, and the Gemini app, while Gemini Omni Flash brings video generation into the same push.
Open-model competition is shifting from leaderboard scores to agent operating costs. NVIDIA says Nemotron 3 Ultra is a 550B MoE model with 5x faster inference and up to 30% lower cost for complex agentic tasks.