Google DeepMind brings Gemini Embedding 2 to preview for multimodal retrieval

Original: Google DeepMind launches Gemini Embedding 2 in preview

LLM · Mar 17, 2026 · By Insights AI · 2 min read

Google DeepMind said on X on March 10, 2026 that Gemini Embedding 2 is now available in preview through the Gemini API and Vertex AI. The company describes it as the first fully multimodal embedding model built on the Gemini architecture, designed to map text, images, video, audio, and documents into a shared vector space.

That description is more significant than it may sound. Many production retrieval systems still rely on separate models for text search, image search, document indexing, and media understanding. A genuinely multimodal embedding layer can simplify the stack by letting teams store and compare different content types in one representation. That matters for enterprise search, recommendation systems, multimodal RAG, and any workflow where users mix screenshots, PDFs, voice notes, or clips with text queries.
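To make the "one representation" point concrete, here is a minimal retrieval sketch. The vectors are random stand-ins, not actual Gemini Embedding 2 output, and the index keys are hypothetical; the point is that once text, image, and document embeddings live in one space, a single cosine-similarity index can serve them all:

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(v):
    return v / np.linalg.norm(v)

# Hypothetical unit vectors standing in for embeddings of mixed content types.
# In a shared multimodal space, these are all directly comparable.
index = {
    "text:release-notes": normalize(rng.normal(size=768)),
    "image:dashboard.png": normalize(rng.normal(size=768)),
    "pdf:quarterly-report": normalize(rng.normal(size=768)),
}

def search(query_vec, index, top_k=2):
    # Cosine similarity reduces to a dot product for unit-length vectors.
    scores = {key: float(vec @ query_vec) for key, vec in index.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

query = normalize(rng.normal(size=768))
print(search(query, index))
```

With separate per-modality models, each of these content types would need its own index and its own query encoder; a shared space collapses that to one store and one similarity function.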

Google says the model supports more than 100 languages and can process mixed inputs rather than only one modality at a time. In the launch materials, the company also highlights support for up to 8,192 text tokens, up to 6 images per request, short video and audio inputs, and PDF documents. It also exposes multiple output sizes, including 3,072, 1,536, and 768 dimensions, using Matryoshka Representation Learning so teams can trade retrieval quality against storage and serving cost.
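The Matryoshka property means the smaller output sizes are, by design, leading prefixes of the full vector: the most useful information is packed into the first dimensions. A sketch of the trade-off (using a random stand-in vector, not real model output) is simply truncating a 3,072-dimension embedding to 768 and re-normalizing:

```python
import numpy as np

def truncate_embedding(vec, dims):
    # Matryoshka-style embeddings concentrate information in the leading
    # dimensions, so a prefix keeps most of the retrieval quality while
    # cutting storage and serving cost proportionally.
    truncated = np.asarray(vec)[:dims]
    return truncated / np.linalg.norm(truncated)  # re-normalize for cosine search

rng = np.random.default_rng(42)
full = rng.normal(size=3072)
full = full / np.linalg.norm(full)

small = truncate_embedding(full, 768)
print(small.shape)  # (768,)
```

At 768 dimensions instead of 3,072, a vector store holds a quarter of the data per item, which is exactly the quality-versus-cost dial the launch materials describe.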

The competitive context is also notable. Embeddings rarely get the same attention as flagship chat models, but they quietly determine how much real-world information a system can retrieve and rank before any generation step begins. By pushing a fully multimodal embedding model into preview, Google DeepMind is moving the Gemini family deeper into the infrastructure layer that powers search, knowledge systems, and agent memory.

For developers, the practical takeaway is straightforward: if Gemini Embedding 2 performs well in production, it could reduce the number of specialized vector pipelines they need to maintain. That can lower system complexity, make multimodal retrieval more natural, and give Google a stronger position in the part of the AI stack that sits underneath assistants, copilots, and enterprise knowledge tools.

