Reddit Turns MemPalace Into a Memory-Infrastructure Story, With Caveats Included
Original: An actress Milla Jovovich just released a free open-source AI memory system that scored 100% on LongMemEval, beating every paid solution
A highly upvoted r/singularity post pulled MemPalace into the main AI feed with a strong headline: a free open-source memory system that scored 100% on LongMemEval and beat paid products. The linked GitHub repo does describe unusually strong results for long-term memory retrieval, but the more interesting part is that the maintainers used the README itself to narrow their own claims almost immediately after launch.
MemPalace’s core idea is not to let an LLM decide what is worth remembering. Instead, it stores raw conversation text locally in ChromaDB and relies on retrieval to find the relevant passages later. In the README and the project’s benchmark document, the authors say the system reaches 96.6% recall@5 on LongMemEval in raw verbatim mode with no API calls, and can hit 100% with an optional Haiku or Sonnet reranking stage. They present that as an argument against the dominant design in AI-memory tools, where another model first extracts facts or summaries and silently discards the context around them.
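To make the "store verbatim, retrieve later" idea and the recall@5 metric concrete, here is a minimal Python sketch. It is not MemPalace's code. The `VerbatimStore` class, its word-count cosine scorer, and the `recall_at_k` helper are hypothetical stand-ins for illustration; the real project is described as using ChromaDB for storage and search. The metric is computed in a simplified form, as the fraction of queries whose labelled evidence turn appears in the top k results.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VerbatimStore:
    """Hypothetical local store that keeps raw conversation turns unchanged."""

    def __init__(self):
        self.turns = {}  # turn_id -> raw text, exactly as written

    def add(self, turn_id, text):
        # No summarising and no fact extraction: the full text is kept.
        self.turns[turn_id] = text

    def search(self, query, k=5):
        """Return up to k turn ids, best match first (toy lexical scorer)."""
        q = Counter(tokenize(query))
        scored = [
            (cosine(q, Counter(tokenize(text))), turn_id)
            for turn_id, text in self.turns.items()
        ]
        scored.sort(key=lambda s: (-s[0], s[1]))
        return [turn_id for score, turn_id in scored[:k] if score > 0]


def recall_at_k(store, labelled_queries, k=5):
    """Fraction of (query, gold_turn_id) pairs whose gold turn is in the top k."""
    hits = sum(1 for query, gold in labelled_queries if gold in store.search(query, k))
    return hits / len(labelled_queries)


store = VerbatimStore()
store.add("t1", "I moved to Lisbon in March and started a new job at a bakery.")
store.add("t2", "My dog Pixel hates thunderstorms.")
store.add("t3", "We discussed the budget for the kitchen renovation.")

queries = [
    ("Where did I move in March?", "t1"),
    ("What is my dog afraid of?", "t2"),
]
print(recall_at_k(store, queries, k=5))
```

An optional reranking stage, such as the Haiku or Sonnet step the authors describe, would sit after `search` and reorder the candidates it returns. It is omitted here because it requires a model call.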
What kept the story from becoming pure hype was the project’s own correction note, dated April 7, 2026. In that note, the maintainers say the original README overstated a “30x lossless compression” claim, used a bad AAAK token-count example, and blurred the distinction between the raw-mode 96.6% result and the higher reranked numbers. They also say the “100% with Haiku rerank” result is real but was not yet fully reflected in the public benchmark scripts at the time of the note. That is a meaningful caveat. The repo is not just claiming state-of-the-art performance; it is also documenting where its first presentation ran ahead of the evidence available in the public code path.
Why Reddit amplified it anyway
The Reddit reaction still makes sense. A local-first memory layer that can run without a subscription, keep data on-device, expose MCP tools, and still compete with cloud-heavy memory products is exactly the kind of infrastructure story AI power users want right now. The post was less about celebrity than about a market signal: developers are increasingly skeptical of memory systems that summarize first and retrieve second. MemPalace is getting attention because it argues for the opposite baseline, and because the repo’s own self-corrections make the trade-offs clearer rather than hiding them.