Google shifts more ML spend to Cloud as Gemini moves from pilots to fleets
Original: Cloud Next ’26: Momentum and innovation at Google scale
Google used Cloud Next 2026 to make a point broader than any single product launch: enterprise AI is leaving the pilot phase and entering fleet management. In Sundar Pichai's April 22 post, Google said its first-party models now process more than 16 billion tokens per minute via direct API use by customers, up from 10 billion last quarter. Just over half of Google's overall machine learning compute investment in 2026 is expected to go to the Cloud business.
The company also said paid monthly active users for Gemini Enterprise grew 40% quarter over quarter in Q1. Google framed the next problem as orchestration, not access. The question for customers is no longer whether they can build one agent. It is how to manage thousands. That is why Google introduced a new Gemini Enterprise Agent Platform, which it describes as the secure full-stack layer to build, govern, scale, and optimize agents.
Infrastructure is part of the pitch. Google unveiled two eighth-generation TPU lines: TPU 8t for training, which it says can scale to 9,600 TPUs and 2 petabytes of shared high-bandwidth memory in a single superpod, and TPU 8i for inference, which connects 1,152 TPUs in a pod and adds 3x more on-chip SRAM to handle lower-latency, higher-throughput agent workloads. Google says TPU 8t delivers 3x the processing power of Ironwood and up to 2x more performance per watt.
Security is the other half. Google said its Security Operations Center agents now triage tens of thousands of unstructured threat reports each month and have cut threat mitigation time by more than 90%. Inside the company, it said 75% of all new code is now AI-generated and approved by engineers, up from 50% last fall. Those numbers are meant to show that Google runs on the same agentic tooling it wants customers to buy.
The competitive message is clear. Google wants to sell not just models, but the control plane, security stack, and compute fabric for organizations that expect agents to run across real systems. The Cloud Next metrics suggest that the market is starting to reward vendors that can operate at that scale, not only demo it.
Related Articles
Microsoft said it will invest more than US$1 billion in Thailand’s cloud and AI infrastructure from 2026 to 2028. The company paired the infrastructure commitment with regulatory engagement, an e-commerce generative AI feasibility study, and workforce and startup collaboration.
TNW reports that Google is discussing two AI chips with Marvell: a memory processing unit and an inference-focused TPU. No contract is signed yet, but the talks show how serving models, not just training them, is driving custom silicon strategy.
Google said on March 11, 2026 that it has completed the Wiz acquisition and will bring the company into Google Cloud while keeping Wiz available across all major clouds. The company is positioning the deal as a multicloud and AI security move as attackers increasingly use AI to scale threats.