Google-Marvell talks show inference is now the chip fight
Original: Google is in talks with Marvell to build custom AI inference chips as it diversifies beyond Broadcom
TNW, citing The Information, reports that Google is in talks with Marvell Technology to develop two chips for running AI models more efficiently. One is a memory processing unit designed to work with Google's existing Tensor Processing Units. The other is a new TPU built specifically for inference, the phase where models serve users rather than learn from training data. The discussions have not yet produced a signed contract.
The important point is that this is not a clean replacement story. Broadcom, Google's primary custom chip partner, recently secured a long-term agreement to design and supply TPUs and networking components through 2031. TNW frames the Marvell talks as part of a wider multi-supplier architecture that also includes MediaTek and TSMC, with different partners handling different cost and performance segments.
The center of gravity is inference. Training a frontier model can consume immense compute for weeks or months, but it is still a bounded event. Inference runs continuously, answering every query from every user. As Google pushes AI deeper into Search, Gemini, and Cloud AI APIs, even a small saving on each inference call adds up across billions of requests.
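To make the scale argument concrete, here is a minimal back-of-the-envelope sketch. The request volume and per-request saving are hypothetical placeholders chosen for illustration; neither figure comes from TNW or Google.

```python
def annual_savings_usd(requests_per_day: float,
                       saving_per_request_usd: float,
                       days: int = 365) -> float:
    """Yearly savings from a fixed per-request cost reduction."""
    return requests_per_day * saving_per_request_usd * days


# Hypothetical inputs, not reported figures:
# 1 billion requests per day, one hundredth of a cent saved on each.
HYPOTHETICAL_REQUESTS_PER_DAY = 1e9
HYPOTHETICAL_SAVING_PER_REQUEST_USD = 0.0001

savings = annual_savings_usd(HYPOTHETICAL_REQUESTS_PER_DAY,
                             HYPOTHETICAL_SAVING_PER_REQUEST_USD)
print(f"Annual savings: ${savings:,.0f}")
```

Under those assumed inputs, a saving too small to notice on a single call comes to roughly $36.5 million a year, which is why per-request efficiency drives chip decisions in a way a one-time training run does not.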
Google has already positioned Ironwood, its seventh-generation TPU, for this shift. TNW says Ironwood delivers ten times the peak performance of TPU v5p and can scale to a 9,216-chip liquid-cooled superpod producing 42.5 FP8 exaflops. A Marvell-designed inference chip would likely supplement that roadmap rather than displace it, giving Google more silicon options for different workload profiles and price points.
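The superpod figures also imply a rough per-chip number. The sketch below simply divides the reported pod total by the reported chip count; it assumes the 42.5 exaflops figure is an even sum of per-chip peak FP8 throughput, which the source does not state explicitly.

```python
# Figures as reported by TNW for an Ironwood superpod.
POD_CHIPS = 9_216
POD_FP8_EXAFLOPS = 42.5


def per_chip_petaflops(pod_exaflops: float, chips: int) -> float:
    """Average peak petaflops per chip implied by a pod-level total."""
    return pod_exaflops * 1_000 / chips  # 1 exaflop = 1,000 petaflops


per_chip = per_chip_petaflops(POD_FP8_EXAFLOPS, POD_CHIPS)
print(f"Implied peak FP8 per chip: {per_chip:.2f} petaflops")
```

That works out to about 4.61 peak FP8 petaflops per chip. It is a peak figure, not a measure of delivered inference throughput.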
Marvell has reason to be in the conversation. Its custom silicon work spans Amazon's Trainium processors, Microsoft's Maia AI accelerator, Meta's data processing units, and Google's Axion ARM CPU. Nvidia's recent $2 billion investment in Marvell and Marvell's acquisition of Celestial AI also put the company closer to both the GPU and ASIC ecosystems. For Google, the strategic prize is clear: reduce dependence on any single supplier while tuning AI infrastructure around the cost curve that now matters most.