#retrieval

AI sources.twitter 3d ago 2 min read

Gemini Embedding 2 reaches GA for five-modality retrieval

Why it matters: retrieval stacks are being pulled from text-only search into multimodal memory. Google AI Studio said Gemini Embedding 2 is generally available and covers text, image, video, audio, and documents through one model path.

#google #gemini #embeddings

LLM sources.twitter 3d ago 1 min read

Perplexity says Qwen post-training beats GPT on factuality cost

Why it matters: search products need factuality and citations, not just fluent answers. Perplexity said its SFT + RL pipeline lets Qwen models match or beat GPT models on factuality at lower cost.

#perplexity #qwen #retrieval

LLM sources.twitter Apr 10, 2026 2 min read

Databricks argues memory, not reasoning alone, is the next scaling bottleneck for AI agents

On April 10, 2026, Databricks AI Research published Memory Scaling for AI Agents, arguing that agent performance can improve as external memory grows. The post reports gains in both accuracy and efficiency from labeled examples, raw conversation logs, and organizational knowledge.

#databricks #ai-agents #memory

LLM Hacker News Apr 4, 2026 2 min read

Mintlify Replaces RAG with a Virtual Filesystem for Its Docs Assistant

Mintlify says chunked RAG was too limited for docs exploration, so it built ChromaFs, a virtual filesystem over Chroma that cuts assistant session creation from about 46 seconds to about 100ms. HN readers were notably receptive to the filesystem-first design and the argument that agent tooling benefits from interpretable, UNIX-like retrieval.

#rag #agents #docs

AI Hacker News Mar 25, 2026 2 min read

Hacker News highlights DuckDB vector search fixes with ACORN-1 and RaBitQ

Hacker News picked up a DuckDB community extension that fixes filtered HNSW search and adds aggressive vector compression, making retrieval workloads more predictable under real SQL filters.

#duckdb #vector-search #acorn

LLM sources.twitter Mar 22, 2026 2 min read

Google launches Gemini Embedding 2 for unified text, image, audio, video, and document search

Google AI Studio promoted Gemini Embedding 2 in a March 12, 2026 X post, and Google’s March 10 blog post says the model maps text, images, video, audio, and documents into a single embedding space. Google says it is in public preview through the Gemini API and Vertex AI and is designed for multimodal retrieval and classification.

#google #gemini #embeddings

LLM Reddit Mar 22, 2026 2 min read

r/LocalLLaMA Highlights Graph-RAG Work That Lets Llama 8B Challenge 70B Multi-Hop QA

A fresh r/LocalLLaMA post argues that the main bottleneck in Graph-RAG multi-hop QA is often reasoning rather than retrieval. The linked paper suggests structured prompting and graph-based context compression can let an open Llama 8B model match or beat a plain 70B baseline at a much lower cost.

#graph-rag #llama #reasoning

AI Reddit Mar 13, 2026 2 min read

Reddit debates an AI memory system that forgets on purpose instead of storing everything

A post in r/artificial argues that long-running agents may need decay, reinforcement, and selective forgetting more than another vector database, prompting a discussion about episodic memory, compression, and retrieval quality.

#ai-agents #memory #retrieval

LLM Hacker News Mar 13, 2026 2 min read

Document poisoning in RAG systems shows why ingestion controls matter more than output filters

A Hacker News discussion around Amine Raji's local ChromaDB lab highlights a practical risk in RAG systems: attackers can win by contaminating the source corpus, and the strongest defense may sit at ingestion rather than in the prompt.

#rag #security #retrieval

LLM sources.twitter Feb 27, 2026 1 min read

Perplexity Launches `pplx-embed` Family for Web-Scale Retrieval with INT8 and Binary Outputs

Perplexity announced on February 26, 2026 that `pplx-embed-v1` and `pplx-embed-context-v1` are now available in 0.6B and 4B variants. The company positions the release as retrieval-first infrastructure with quantized embeddings and benchmark-focused performance claims.

#perplexity #embeddings #retrieval