Google Research is turning enterprise RAG into an iterative agent workflow, not a one-shot retrieval step. Its sufficient-context check lifted factuality accuracy by up to 34% and reached 90.1% accuracy in a cross-corpus FramesQA setup.
#rag
RSS FeedGoogle has updated the Gemini API File Search tool to support multimodal content including images, audio, and video, making it easier for developers to build efficient, verifiable RAG systems.
Hacker News liked the promise of model-agnostic memory, but the real energy in the thread came from one immediate question: how does this avoid context pollution? Skepticism arrived faster than praise.
Cloudflare is turning AutoRAG into AI Search, a retrieval primitive agents can create and query from Workers. The open beta adds BM25 plus vector search, built-in storage and index, metadata boosting, and cross-instance search with concrete free and paid limits.
MongoDB said on March 20, 2026 that Heidi’s AI scribe has scaled to 81 million clinical consultations across 190+ countries in 18 months. The linked case study argues Atlas and Atlas Vector Search let Heidi consolidate heterogeneous medical data, run RAG, and scale without downtime in healthcare settings.
Mintlify says chunked RAG was too limited for docs exploration, so it built ChromaFs, a virtual filesystem over Chroma that cuts assistant session creation from about 46 seconds to about 100ms. HN readers were notably receptive to the filesystem-first design and the argument that agent tooling benefits from interpretable, UNIX-like retrieval.
A detailed engineering write-up resonated on Hacker News because it treated production RAG as a data and operations problem, not a prompt demo.
A Hacker News thread around Skylar Payne's DSPy post argues that teams often rebuild DSPy-style LLM engineering patterns as systems mature, even though unfamiliar abstractions, Python fit, and eval design still slow direct adoption.
IBM Granite on 2026-03-20 released Mellea 0.4.0 and three Granite Libraries built around Granite 4.0 Micro. The release is aimed at teams that want more structured, schema-safe, and safety-aware agentic RAG pipelines instead of depending on prompt-only orchestration.
Google put Gemini Embedding 2 into public preview on March 10, 2026. The company says the model handles text, images, and mixed multimodal documents in one embedding space while improving benchmark scores to 68.32 for text and 53.3 for image tasks without changing price or vector dimensions.
A Hacker News thread drew attention to CodeWall's March 9 disclosure on McKinsey's Lilli platform, where an autonomous agent reportedly chained unauthenticated endpoints, SQL injection, and prompt-layer access into full production-database compromise.
A Hacker News discussion around Amine Raji's local ChromaDB lab highlights a practical risk in RAG systems: attackers can win by contaminating the source corpus, and the strongest defense may sit at ingestion rather than in the prompt.