Google Cloud brings autonomous embedding generation to BigQuery in preview

Original: Say goodbye to manual vector management! BigQuery’s new autonomous embedding generation (preview) automatically syncs your data with Vertex AI models. Glance the guide ↓ https://goo.gle/422Inow

LLM · Apr 10, 2026 · By Insights AI

What Google Cloud highlighted

On April 10, 2026, Google Cloud Tech used X to point developers to BigQuery’s autonomous embedding generation preview. The idea is simple but useful: instead of maintaining a separate pipeline to create and refresh vectors, BigQuery can keep an embedding column synchronized with a source text column automatically.

The documentation describes this as a generated-column pattern. When data is inserted or modified in the source column, BigQuery automatically generates or updates the embedding column in the background. That removes one of the most common operational pain points in retrieval systems: vectors going stale because the application data changed but the embedding job did not run.
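The generated-column pattern can be sketched in a few lines of Python. This is a toy in-memory model, not BigQuery code: `AutoEmbedTable`, `toy_embed`, and `refresh` are hypothetical names, and the hash-based embedding is a deterministic stand-in with no semantic meaning. It only shows the bookkeeping, where a write to the source column marks the embedding as pending and a background pass regenerates it.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


def toy_embed(text: str, dim: int = 4) -> List[float]:
    """Hypothetical stand-in for an embedding model (deterministic, not semantic)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


@dataclass
class Row:
    text: str
    embedding: Optional[List[float]] = None  # None means "not generated yet"


@dataclass
class AutoEmbedTable:
    """Toy table whose embedding column is derived from the text column."""
    rows: Dict[str, Row] = field(default_factory=dict)
    pending: Set[str] = field(default_factory=set)

    def upsert(self, key: str, text: str) -> None:
        # Writing the source column discards the old embedding and queues a refresh.
        self.rows[key] = Row(text=text)
        self.pending.add(key)

    def refresh(self) -> int:
        # Background step: regenerate embeddings only for rows that changed.
        count = 0
        for key in sorted(self.pending):
            row = self.rows[key]
            row.embedding = toy_embed(row.text)
            count += 1
        self.pending.clear()
        return count


table = AutoEmbedTable()
table.upsert("p1", "red wooden train")
table.refresh()

table.upsert("p1", "blue plastic boat")  # source text changes
embedding_cleared = table.rows["p1"].embedding is None  # old vector is gone, not stale
refreshed = table.refresh()
```

In this model the application never calls the embedding step itself; a write is enough to schedule it, which is the property that keeps vectors from silently going stale.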

How it works

The guide shows autonomous embedding generation built around the AI.EMBED function inside a CREATE TABLE or ALTER TABLE statement. BigQuery generates and stores the embeddings asynchronously, so the embedding column is filled in after the write rather than as part of it. Once populated, those values can be used for search and indexing workflows.

  • A table can define a source STRING column and an automatically generated embedding column.
  • BigQuery can call a Vertex AI embedding model such as text-embedding-005 to maintain that column.
  • After embeddings exist, teams can create a vector index and query it with AI.SEARCH.
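A table definition along these lines might look like the statement assembled below. It is a sketch, not a copy of the guide: the `GENERATED ALWAYS AS ... STORED OPTIONS (asynchronous = TRUE)` clause, the `STRUCT<result, status>` column type, and the `connection_id` and `endpoint` argument names are assumptions to check against the BigQuery documentation. `mydataset.products` and `us.example_connection` are placeholder names, and `build_autoembed_ddl` is a hypothetical helper. Python is used only to assemble the SQL text.

```python
def build_autoembed_ddl(table: str, source_col: str,
                        connection_id: str, endpoint: str) -> str:
    """Assemble a CREATE TABLE statement with an auto-generated embedding column.

    ASSUMPTION: the clause layout and option names below are illustrative and
    should be verified against the BigQuery docs before use. Inputs are treated
    as trusted identifiers; nothing is escaped.
    """
    lines = [
        f"CREATE TABLE {table} (",
        f"  {source_col} STRING,",
        # Assumed result type: the vector plus a per-row status string.
        f"  {source_col}_embedding STRUCT<result ARRAY<FLOAT64>, status STRING>",
        "    GENERATED ALWAYS AS (",
        f"      AI.EMBED({source_col},",
        f"               connection_id => '{connection_id}',",
        f"               endpoint => '{endpoint}')",
        "    )",
        # Assumed option: embeddings are filled in by a background process.
        "    STORED OPTIONS (asynchronous = TRUE)",
        ");",
    ]
    return "\n".join(lines)


ddl = build_autoembed_ddl(
    table="mydataset.products",             # placeholder dataset and table
    source_col="description",
    connection_id="us.example_connection",  # placeholder connection resource
    endpoint="text-embedding-005",          # model named in the article
)
print(ddl)
```

The source column stays an ordinary STRING column that applications write to; only the derived column mentions the model, so swapping models would be a schema change and not an application change.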

Why the preview matters

The most interesting detail is the split between external and built-in model paths. According to the docs, BigQuery can call Vertex AI-hosted embedding models, and there is also a preview option for the built-in embeddinggemma-300m model. On that built-in path, Google says the data stays in BigQuery and no Vertex AI charges are incurred. That is a notable design choice because it turns embedding freshness into more of a data platform feature than a separate MLOps task.

The guide also makes the governance story explicit. Teams still need the right BigQuery permissions, a connection resource when calling Vertex AI endpoints, and the appropriate Vertex AI user role for the connection’s service account. So the feature removes glue code, not operational discipline.

Why this is high-signal

Vector search has been one of the messier parts of production AI systems because every team ends up building some version of the same loop: change source data, detect the change, regenerate embeddings, store the result, re-index the table. Autonomous embedding generation moves that loop into the database layer itself. For enterprise teams already living in BigQuery, that could make RAG, semantic search, and similarity-based workflows easier to keep current and less vulnerable to pipeline drift.

Sources: Google Cloud Tech X post · BigQuery documentation
