Google turns Cloud Next into an agent-platform pitch at 16B tokens per minute
Original: 7 highlights from Google Cloud Next ’26
Google used Cloud Next ’26 to make a blunt argument: the AI market is moving beyond model demos and into agent operations. In its Apr. 24 recap, Google said 75% of Cloud customers now use its AI products, 330 customers processed more than one trillion tokens each over the past 12 months, and direct API traffic across its first-party models has climbed to more than 16 billion tokens per minute. Those numbers are less about bragging rights than about positioning. Google wants customers to see it as the place where agents are built, governed and run, not merely the company that ships frontier models.
The centerpiece is the new Gemini Enterprise Agent Platform. Google describes it as an end-to-end workspace that bundles model building and tuning with agent integration, security, and DevOps. The company is also trying to lower the skill floor. Agent Studio, the low-code interface inside the platform, is meant to let developers and business teams build and test agents with natural language instead of stitching together infrastructure by hand. Google says the platform gives access to Gemini 3.1 Pro for complex workflows, Gemini 3.1 Flash Image for image generation, Lyria 3 for audio, and even Anthropic’s Claude Opus 4.7. That last detail matters: open choice is turning into a competitive feature, not a concession.
The scale claims add context. A year ago, cloud AI stories were mostly about pilots. Google is now saying the agentic enterprise is already here, and the stack must include more than models. It must also include data, tooling, security, orchestration, monitoring and distribution inside the company. That is why the Cloud Next recap groups TPUs, security agents, productivity tools and developer platform features into one narrative. Google is not selling a single feature. It is selling a control plane for AI work.
That framing matters because the frontier is shifting. Model quality still decides the shortlist, but it no longer closes the deal on its own. Enterprises want to know how agents connect to internal systems, how they are governed, how fast they can be deployed and which models can be swapped in when cost, latency or compliance requirements change. Google’s recap reads like an answer to that demand. The headline numbers, especially 16 billion tokens per minute and the 330 customers that each processed more than one trillion tokens, show how hard the company is pushing scale. The more important takeaway is strategic: cloud AI is turning into runtime, governance and workflow infrastructure.
Related Articles
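The core of Google I/O 2026 is a push to move Gemini beyond a chatbot inside apps and into a broader execution layer. Gemini 3.5 Flash rolled out to the API, Antigravity, Search and the Gemini app, while Gemini Omni put video generation and editing front and center.
Google unveiled Gemini 3.5 Flash at I/O 2026 and announced the Managed Agents API. A single API call can provision a complete agent in an isolated Linux environment.
The weakness of enterprise RAG is not that it lacks the answer, but that it stops too early when the evidence it needs is scattered across different repositories. Google Research said Agentic RAG, which checks whether the context is sufficient and searches again, raised accuracy on factuality datasets by up to 34%.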