Google DeepMind expands multimodal retrieval with Gemini Embedding 2 preview

On March 10, 2026, Google DeepMind announced on X that Gemini Embedding 2 is available in preview through the Gemini API and Vertex AI. The company describes it as the first fully multimodal embedding model built on the Gemini architecture, designed to map text, images, video, audio, and documents into a single shared vector space.

This matters because production retrieval systems still often split text search, image search, document indexing, and media understanding across separate models. Once a truly multimodal embedding layer is in place, different kinds of content can be stored and compared in the same representation space, which simplifies the stack. That directly affects enterprise search, recommendation systems, multimodal RAG, and workflows where users mix screenshots, PDFs, voice notes, and clips with text queries.
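To make the "same representation space" idea concrete, here is a minimal sketch of one index that holds items of several modalities and ranks them against a single query by cosine similarity. The three-dimensional vectors are made-up placeholders rather than output from any real model, and `Item`, `cosine`, and `search` are hypothetical names introduced only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Item:
    item_id: str
    modality: str          # e.g. "text", "image", "pdf", "audio"
    vector: List[float]    # embedding; assumed to live in one shared space


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(index: List[Item], query: List[float], top_k: int = 3) -> List[Tuple[float, Item]]:
    """Rank every item, regardless of modality, against one query vector."""
    scored = [(cosine(query, item.vector), item) for item in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy placeholder vectors (NOT real embeddings), one per modality.
INDEX = [
    Item("faq.txt", "text", [0.9, 0.1, 0.0]),
    Item("screenshot.png", "image", [0.8, 0.2, 0.1]),
    Item("manual.pdf", "pdf", [0.1, 0.9, 0.2]),
    Item("voice_note.wav", "audio", [0.0, 0.2, 0.9]),
]

# A stand-in for an embedded text query.
results = search(INDEX, [1.0, 0.0, 0.0], top_k=2)
for score, item in results:
    print(f"{item.item_id} ({item.modality}): {score:.3f}")
```

The point of the sketch is structural: with one shared space there is a single index and a single scoring function, instead of a separate pipeline per content type. A production system would replace the linear scan with a vector database or approximate nearest-neighbor index.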

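Google said the model supports more than 100 languages and can handle mixed inputs rather than only a single modality. The announcement materials also highlighted support for up to 8,192 text tokens, up to 6 images per request, short video and audio inputs, and PDF documents. Through Matryoshka Representation Learning, the model also offers output dimensions of 3,072, 1,536, and 768, letting teams balance retrieval quality against storage and serving costs.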
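The dimension trade-off can be illustrated with a short sketch. With Matryoshka-style embeddings, a common pattern is to keep the leading components of the full vector and re-normalize to unit length. The code below assumes that pattern, uses a synthetic stand-in vector rather than real model output, and assumes float32 storage (4 bytes per value). Whether the API returns reduced vectors directly or leaves truncation to the client is not stated in the announcement, and `truncate_and_normalize` and `storage_bytes` are hypothetical helper names.

```python
import math
from typing import List

# Output sizes named in the announcement.
SUPPORTED_DIMS = (3072, 1536, 768)


def truncate_and_normalize(vector: List[float], dims: int) -> List[float]:
    """Keep the first `dims` components and rescale to unit L2 norm."""
    if dims > len(vector):
        raise ValueError("dims exceeds the length of the input vector")
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head
    return [x / norm for x in head]


def storage_bytes(num_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    """Raw storage for dense vectors, assuming float32 (an assumption)."""
    return num_vectors * dims * bytes_per_value


# Synthetic stand-in for a 3,072-dimensional embedding (NOT model output).
full = [math.sin(i + 1) for i in range(3072)]
small = truncate_and_normalize(full, 768)

for dims in SUPPORTED_DIMS:
    gb = storage_bytes(1_000_000, dims) / 1e9
    print(f"{dims} dims: {gb:.3f} GB per million vectors")
```

Under these assumptions, one million vectors take roughly 12.3 GB at 3,072 dimensions and roughly 3.1 GB at 768, before any index overhead. How much retrieval quality the smaller sizes give up is something to measure on your own data.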

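It matters for the competitive picture as well. Embeddings get less attention than flagship chat models, but in practice they determine how well a system finds and ranks information before the generation step. Releasing a fully multimodal embedding model in preview suggests that Google DeepMind intends to extend the Gemini family deeper into the infrastructure layer that underpins search, knowledge systems, and agent memory.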

For developers, the takeaway is clear. If Gemini Embedding 2 proves itself in production, teams can reduce the number of specialized vector pipelines they maintain and implement multimodal retrieval more naturally. It also means Google could tighten its hold on the foundation of the AI stack that actually runs beneath assistants, copilots, and enterprise knowledge tools.
