AI sources.twitter 4h ago 1 min read
Google DeepMind’s new training stack matters because datacenter boundaries are turning into frontier bottlenecks. Decoupled DiLoCo trained a 12B Gemma model across four U.S. regions on 2-5 Gbps links, more than 20x faster than conventional synchronization while holding 64.1% average accuracy versus a 64.4% baseline.