Google DeepMind brings Gemini Embedding 2 to preview for multimodal retrieval
Google DeepMind said on X on March 10, 2026 that Gemini Embedding 2 is now available in preview through the Gemini API and Vertex AI. The company describes it as the first fully multimodal embedding model built on the Gemini architecture, designed to map text, images, video, audio, and documents into a shared vector space.
That description is more significant than it may sound. Many production retrieval systems still rely on separate models for text search, image search, document indexing, and media understanding. A genuinely multimodal embedding layer can simplify the stack by letting teams store and compare different content types in one representation. That matters for enterprise search, recommendation systems, multimodal RAG, and any workflow where users mix screenshots, PDFs, voice notes, or clips with text queries.
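The payoff of a shared vector space is that retrieval becomes a single nearest-neighbor search, regardless of modality. A minimal sketch (the 4-dimensional vectors and file names below are toy stand-ins; a real system would obtain each vector from the embedding API, one call per text, image, or PDF item):

```python
import numpy as np

def cosine_top_k(query: np.ndarray, store: dict, k: int = 2) -> list:
    """Rank stored items by cosine similarity to the query vector."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(store, key=lambda name: cos(query, store[name]), reverse=True)
    return ranked[:k]

# Toy vectors standing in for model outputs across different modalities.
store = {
    "report.pdf":     np.array([0.9, 0.1, 0.0, 0.1]),
    "screenshot.png": np.array([0.1, 0.9, 0.1, 0.0]),
    "voice_note.wav": np.array([0.0, 0.2, 0.9, 0.1]),
}
query = np.array([0.8, 0.2, 0.1, 0.0])  # embedding of a text query
print(cosine_top_k(query, store))  # → ['report.pdf', 'screenshot.png']
```

Because every content type lives in the same space, the same ranking function serves text-to-image, text-to-audio, and text-to-document lookups; without a shared space, each pairing would need its own model and index.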
Google says the model supports more than 100 languages and can process mixed inputs rather than only one modality at a time. In the launch materials, the company also highlights support for up to 8,192 text tokens, up to 6 images per request, short video and audio inputs, and PDF documents. The model also exposes multiple output sizes, including 3,072, 1,536, and 768 dimensions, using Matryoshka Representation Learning so teams can trade retrieval quality against storage and serving cost.
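The Matryoshka property is what makes those multiple output sizes cheap: a smaller embedding is just the leading coordinates of the full one, re-normalized. The API may well expose a dimensionality parameter directly; the sketch below only illustrates the underlying recipe, with a random vector standing in for a real 3,072-dimension model output:

```python
import numpy as np

def truncate_mrl(embedding: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` coordinates and re-normalize to unit length,
    the standard way to shrink a Matryoshka-trained embedding."""
    head = embedding[:dim]
    return head / np.linalg.norm(head)

rng = np.random.default_rng(0)
full = rng.normal(size=3072)      # stand-in for a full-size model output
small = truncate_mrl(full, 768)   # 4x less storage per stored vector

print(small.shape)                # (768,)
```

The practical consequence: a team can index the 768-dimension version for cheap first-pass retrieval and keep the full 3,072-dimension vectors for re-ranking, without running the model twice.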
The competitive context is also notable. Embeddings rarely get the same attention as flagship chat models, but they quietly determine how much real-world information a system can retrieve and rank before any generation step begins. By pushing a fully multimodal embedding model into preview, Google DeepMind is moving the Gemini family deeper into the infrastructure layer that powers search, knowledge systems, and agent memory.
For developers, the practical takeaway is straightforward: if Gemini Embedding 2 performs well in production, it could reduce the number of specialized vector pipelines they need to maintain. That can lower system complexity, make multimodal retrieval more natural, and give Google a stronger position in the part of the AI stack that sits underneath assistants, copilots, and enterprise knowledge tools.