Google says its AI business has crossed from pilots to operations: 75% of Cloud customers now use AI products, 330 customers each processed more than 1 trillion tokens in the past year, and model traffic exceeds 16 billion tokens per minute. The company used Cloud Next ’26 to turn that scale into a product pitch for the Gemini Enterprise Agent Platform, a full runtime and governance layer for enterprise agents.
Why it matters: Google is turning Vertex AI from a collection of services into a governed agent platform. The linked Google Cloud post says Model Garden gives access to more than 200 models, including Gemini 3.1 Pro, Lyria 3, Gemma 4, and the Claude families.
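Model Garden models are reached through Vertex AI's publisher-model REST surface. A minimal sketch of building that endpoint URL, assuming the standard `generateContent` URL pattern; the project ID and the `gemini-3.1-pro` model ID below are placeholders, not confirmed identifiers:

```python
# Sketch: constructing a Vertex AI publisher-model endpoint URL for a
# Model Garden model. The URL shape follows Vertex AI's REST pattern;
# project and model IDs here are illustrative placeholders.

def publisher_model_url(project: str, location: str,
                        publisher: str, model: str) -> str:
    """Return the REST endpoint for a publisher model's generateContent method."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{location}/"
        f"publishers/{publisher}/models/{model}:generateContent"
    )

# Hypothetical model ID for illustration only.
url = publisher_model_url("my-project", "us-central1", "google", "gemini-3.1-pro")
print(url)
```

The same pattern covers third-party publishers (e.g. `anthropic`) listed in Model Garden, which is what lets one platform front many model families.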
Google Cloud Tech highlighted BigQuery’s autonomous embedding generation preview on April 10, 2026, positioning it as a way to keep vector data in sync without separate ETL glue. The documentation shows automatically maintained embedding columns backed by Vertex AI models, plus a preview built-in model path inside BigQuery.
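The "separate ETL glue" the preview is meant to eliminate is typically a batch job that re-embeds rows whose source text changed. A minimal sketch of that manual sync loop, with a stub `embed()` standing in for a Vertex AI embedding-model call (the autonomous preview keeps this logic inside BigQuery instead):

```python
# Sketch of the manual embedding-sync job that autonomous embedding
# columns replace. embed() is a toy stand-in for a real embedding model.

def embed(text: str) -> list[float]:
    # Stub "embedding": average character code, just to make the flow runnable.
    return [sum(map(ord, text)) / max(len(text), 1)]

def sync_embeddings(rows: dict[int, str], cache: dict[int, list[float]],
                    dirty: set[int]) -> dict[int, list[float]]:
    """Re-embed only the rows whose source text changed since the last run."""
    for row_id in dirty:
        cache[row_id] = embed(rows[row_id])
    dirty.clear()
    return cache

rows = {1: "alpha", 2: "beta"}
cache = sync_embeddings(rows, {}, {1, 2})   # initial backfill
rows[2] = "gamma"
cache = sync_embeddings(rows, cache, {2})   # incremental update
```

The fragility of tracking the `dirty` set externally is exactly the failure mode an automatically maintained embedding column removes.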
Google AI said on March 25, 2026 that Lyria 3 Pro is now available across a broad mix of consumer, developer, and enterprise surfaces. The rollout suggests Google wants music generation to become part of its mainstream AI stack rather than a standalone experiment.
Google said on March 25, 2026 that Lyria 3 Pro can generate tracks up to three minutes long and handle explicit musical structures such as intros, verses, choruses, and bridges. The official blog says the model is also expanding into Vertex AI public preview, AI Studio, Google Vids, and Music AI Sandbox workflows.
Google introduced Gemini 3.1 Flash-Lite on March 3, 2026 as its fastest and lowest-cost Gemini 3 series model. The preview release targets high-volume developer workloads with lower pricing, lower latency, and stronger benchmark scores than the prior 2.5 Flash tier.
Google DeepMind said on X that Gemini Embedding 2 is now in preview through the Gemini API and Vertex AI. The model is positioned as the first fully multimodal embedding model built on the Gemini architecture, aiming to unify retrieval across text, images, video, audio, and documents.
Google on March 3, 2026 introduced Gemini 3.1 Flash-Lite as the fastest and most cost-efficient model in the Gemini 3 family. The preview is rolling out through Google AI Studio and Vertex AI at $0.25/1M input tokens and $1.50/1M output tokens.
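At those rates, per-request cost is simple arithmetic. A back-of-envelope estimator using only the quoted prices:

```python
# Cost check for the quoted Gemini 3.1 Flash-Lite preview pricing:
# $0.25 per 1M input tokens, $1.50 per 1M output tokens.

INPUT_PER_M = 0.25
OUTPUT_PER_M = 1.50

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the quoted per-million-token rates."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# 100k input tokens + 20k output tokens:
print(round(request_cost(100_000, 20_000), 4))  # → 0.055
```

So a fairly large request (100k in, 20k out) costs about five and a half cents, which is the economics the "high-volume" positioning rests on.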
Google AI Developers says Gemini Embedding 2 is now in preview via the Gemini API and Vertex AI. Google describes it as its first fully multimodal embedding model on the Gemini architecture and its most capable embedding model so far.
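A unified multimodal embedding space means retrieval collapses to nearest-neighbor search over one vector space, whatever the source modality. A minimal sketch with toy vectors standing in for model outputs (Gemini Embedding 2's actual dimensions and values are not given in the announcement):

```python
# Sketch: cross-modal retrieval over a single embedding space.
# Vectors below are toy stand-ins, not real model outputs.
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec: list[float], corpus: list[dict]) -> list[dict]:
    """Rank items of any modality by similarity to a query embedding."""
    return sorted(corpus, key=lambda item: cosine(query_vec, item["vec"]),
                  reverse=True)

corpus = [
    {"id": "image:cat.png",        "vec": [0.9, 0.1, 0.0]},
    {"id": "text:dog care guide",  "vec": [0.1, 0.9, 0.2]},
    {"id": "audio:meow.wav",       "vec": [0.8, 0.2, 0.1]},
]
ranked = retrieve([1.0, 0.0, 0.0], corpus)
print([item["id"] for item in ranked])  # cat image ranks first
```

The point of "fully multimodal" is that the image, text, and audio items above need no per-modality index; one similarity function serves them all.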
Google AI posted on X on March 6, 2026 to point developers to Nano Banana 2, saying the model is available through the Gemini API in Google AI Studio and Vertex AI. Google’s linked post positions Nano Banana 2, also known as Gemini 3.1 Flash Image, as a higher-quality, faster image model designed for real application workloads.
Google DeepMind said on March 3, 2026 that Gemini 3.1 Flash-Lite delivers faster performance at a lower price than Gemini 2.5 Flash. Google is rolling the model out in preview via Google AI Studio and Vertex AI for high-volume, latency-sensitive workloads.
Google introduced Gemini 3.1 Flash-Lite on March 3, 2026 as its fastest and most cost-efficient model in the Gemini 3 family. The model ships in Google AI Studio and Vertex AI with a 1 million-token context window and API-level reasoning budget controls.