#rag

LLM 4d ago 2 min read

Google’s Agentic RAG keeps searching until enterprise answers hold up

Google Research is turning enterprise RAG into an iterative agent workflow, not a one-shot retrieval step. Its sufficient-context check lifted factuality accuracy by up to 34% and reached 90.1% accuracy in a cross-corpus FramesQA setup.

#google #rag #agents

AI Hacker News May 10, 2026 1 min read

Google Expands Gemini API File Search to Multimodal RAG

Google has updated the Gemini API File Search tool to support multimodal content including images, audio, and video, making it easier for developers to build efficient, verifiable RAG systems.

#google #gemini #rag

AI Hacker News Apr 26, 2026 2 min read

HN Pokes at Stash, an Open-Source Memory Layer for Agents

Hacker News liked the promise of model-agnostic memory, but the real energy in the thread came from one immediate question: how does this avoid context pollution? Skepticism arrived faster than praise.

#agent-memory #mcp #pgvector

LLM Apr 17, 2026 2 min read

Cloudflare gives agents BM25, vectors and per-customer search

Cloudflare is turning AutoRAG into AI Search, a retrieval primitive agents can create and query from Workers. The open beta adds BM25 plus vector search, built-in storage and indexing, metadata boosting, and cross-instance search with concrete free and paid limits.

#cloudflare #rag #agents

AI X/Twitter Apr 14, 2026 2 min read

MongoDB says Heidi scaled its AI scribe to 81M clinical consultations on Atlas

MongoDB said on March 20, 2026, that Heidi’s AI scribe has scaled to 81 million clinical consultations across 190+ countries in 18 months. The linked case study argues Atlas and Atlas Vector Search let Heidi consolidate heterogeneous medical data, run RAG, and scale without downtime in healthcare settings.

#mongodb #healthcare-ai #vector-search

LLM Hacker News Apr 4, 2026 2 min read

Mintlify Replaces RAG with a Virtual Filesystem for Its Docs Assistant

Mintlify says chunked RAG was too limited for docs exploration, so it built ChromaFs, a virtual filesystem over Chroma that cuts assistant session creation from about 46 seconds to about 100 ms. HN readers were notably receptive to the filesystem-first design and the argument that agent tooling benefits from interpretable, UNIX-like retrieval.

#rag #agents #docs

LLM Hacker News Mar 27, 2026 2 min read

Hacker News revisits what production RAG actually takes on local models

A detailed engineering write-up resonated on Hacker News because it treated production RAG as a data and operations problem, not a prompt demo.

#rag #llamaindex #chromadb

LLM Hacker News Mar 23, 2026 2 min read

Why Teams Rebuild DSPy Patterns Even as Adoption Lags

A Hacker News thread around Skylar Payne's DSPy post argues that teams often rebuild DSPy-style LLM engineering patterns as systems mature, even though unfamiliar abstractions, Python fit, and eval design still slow direct adoption.

#dspy #llm-engineering #hacker-news

LLM Mar 21, 2026 2 min read

IBM releases Mellea 0.4.0 and Granite Libraries for structured AI workflows

On March 20, 2026, IBM Granite released Mellea 0.4.0 and three Granite Libraries built around Granite 4.0 Micro. The release is aimed at teams that want more structured, schema-safe, and safety-aware agentic RAG pipelines instead of depending on prompt-only orchestration.

#ibm #granite #rag

LLM Mar 16, 2026 2 min read

Google puts Gemini Embedding 2 into public preview as its first natively multimodal embedding model

Google put Gemini Embedding 2 into public preview on March 10, 2026. The company says the model handles text, images, and mixed multimodal documents in one embedding space while improving benchmark scores to 68.32 for text and 53.3 for image tasks without changing price or vector dimensions.

#google #embeddings #multimodal

AI Hacker News Mar 14, 2026 2 min read

Hacker News Spotlights AI-Specific SQL Injection That Exposed McKinsey's Lilli Platform

A Hacker News thread drew attention to CodeWall's March 9 disclosure on McKinsey's Lilli platform, where an autonomous agent reportedly chained unauthenticated endpoints, SQL injection, and prompt-layer access into full production-database compromise.

#ai-security #sql-injection #rag

LLM Hacker News Mar 13, 2026 2 min read

Document poisoning in RAG systems shows why ingestion controls matter more than output filters

A Hacker News discussion around Amine Raji's local ChromaDB lab highlights a practical risk in RAG systems: attackers can win by contaminating the source corpus, and the strongest defense may sit at ingestion rather than in the prompt.

#rag #security #retrieval