Reddit Turns MemPalace Into a Memory-Infrastructure Story, With Caveats Included
Original: "An actress Milla Jovovich just released a free open-source AI memory system that scored 100% on LongMemEval, beating every paid solution"
A highly upvoted r/singularity post pulled MemPalace into the main AI feed with a strong headline: a free open-source memory system that scored 100% on LongMemEval and beat paid products. The linked GitHub repo does describe unusually strong results for long-term memory retrieval, but the more interesting part is that the maintainers used the README itself to narrow their own claims almost immediately after launch.
MemPalace’s core idea is not to let an LLM decide what is worth remembering. Instead, it stores raw conversation text locally in ChromaDB and relies on retrieval to find the relevant passages later. In the README and the project’s benchmark document, the authors say the system reaches 96.6% recall@5 on LongMemEval in raw verbatim mode with no API calls, and can hit 100% with an optional Haiku or Sonnet reranking stage. They present that as an argument against the dominant design in AI-memory tools, where another model first extracts facts or summaries and silently discards the context around them.
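The design the README describes can be sketched in a few lines: store every conversation turn verbatim, and let retrieval find the relevant passages later. This is a hypothetical stdlib-only illustration, not MemPalace's code; the real system uses ChromaDB with vector embeddings, and the token-overlap scoring here is a simplified stand-in for embedding similarity:

```python
# Hypothetical sketch of the "store raw, retrieve later" design.
# MemPalace itself uses ChromaDB embeddings; plain token overlap is
# used here only as a stand-in for vector similarity.

class VerbatimStore:
    def __init__(self):
        self.passages = []  # raw conversation turns, never summarized

    def add(self, text):
        # Store the text exactly as-is: nothing is extracted or discarded,
        # so no upstream model decides what is "worth remembering".
        self.passages.append(text)

    def query(self, question, k=5):
        # Score each stored passage by word overlap with the question
        # and return the top-k matches.
        q_tokens = set(question.lower().split())
        scored = [
            (len(q_tokens & set(p.lower().split())), p)
            for p in self.passages
        ]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [p for score, p in scored[:k] if score > 0]

store = VerbatimStore()
store.add("User: my dog is named Biscuit and he hates thunderstorms")
store.add("User: I moved to Lisbon in March for a new job")
print(store.query("what is my dog named", k=2))
```

The point of the verbatim approach is visible even in this toy: because the raw turn survives intact, a later question can recover details (the storm phobia, the exact wording) that a fact-extraction pass might have dropped.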
What kept the story from becoming pure hype was the project’s own correction note, dated April 7, 2026. In that note, the maintainers say the original README overstated a “30x lossless compression” claim, used a bad AAAK token-count example, and blurred the distinction between the raw-mode 96.6% result and the higher reranked numbers. They also say the “100% with Haiku rerank” result is real but was not yet fully reflected in the public benchmark scripts at the time of the note. That is a meaningful caveat. The repo is not just claiming state-of-the-art performance; it is also documenting where its first presentation ran ahead of the evidence available in the public code path.
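For readers unfamiliar with the metric being debated, recall@5 simply asks whether a gold passage appears among the top five retrieved results, averaged over all questions. A minimal illustration with made-up per-question data (this is not the LongMemEval harness):

```python
def recall_at_k(retrieved, relevant, k=5):
    # 1.0 if any gold passage appears in the top-k results, else 0.0.
    return 1.0 if any(doc in relevant for doc in retrieved[:k]) else 0.0

# Hypothetical per-question results; the real benchmark averages this
# over its full question set to produce figures like 96.6%.
questions = [
    (["p3", "p9", "p1", "p7", "p2"], {"p1"}),  # hit at rank 3
    (["p4", "p8", "p6", "p5", "p0"], {"p2"}),  # miss
]
score = sum(recall_at_k(r, g) for r, g in questions) / len(questions)
print(score)  # 0.5
```

This also makes the correction note's distinction concrete: the 96.6% and 100% figures are the same metric computed over two different retrieval configurations (raw mode versus the optional reranked mode), not two different metrics.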
Why Reddit amplified it anyway
The Reddit reaction still makes sense. A local-first memory layer that can run without a subscription, keep data on-device, expose MCP tools, and still compete with cloud-heavy memory products is exactly the kind of infrastructure story AI power users want right now. The post was less about celebrity than about a market signal: developers are increasingly skeptical of memory systems that summarize first and retrieve second. MemPalace is getting attention because it argues for the opposite baseline, and because the repo’s own self-corrections make the trade-offs clearer rather than hiding them.
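The "retrieve first, summarize never" baseline the post is reacting to can be framed as a two-stage pipeline: a cheap local first pass returns candidates, and an optional reranking stage reorders them. This sketch is hypothetical; MemPalace's optional stage calls Haiku or Sonnet, and `score_with_model` below is only a placeholder for that API call:

```python
# Hypothetical two-stage retrieval pipeline. The first pass is local and
# free; the rerank stage stands in for an optional LLM call (MemPalace
# uses Haiku/Sonnet there). score_with_model is a placeholder, not a
# real API.

def first_pass(passages, question, n=20):
    # Cheap candidate retrieval: rank by word overlap, keep top-n.
    q = set(question.lower().split())
    return sorted(passages,
                  key=lambda p: len(q & set(p.lower().split())),
                  reverse=True)[:n]

def score_with_model(question, passage):
    # Placeholder for an LLM rerank call; here, length-penalized overlap.
    q = set(question.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / (len(p) ** 0.5)

def retrieve(passages, question, k=5, rerank=False):
    candidates = first_pass(passages, question)
    if rerank:
        candidates.sort(key=lambda p: score_with_model(question, p),
                        reverse=True)
    return candidates[:k]
```

In this framing, the README's raw verbatim mode corresponds to `rerank=False` (no API calls, fully local), while the reranked mode trades one model call per query for higher precision in the top five.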
Related Articles
GitHub said on April 1, 2026 that Agentic Workflows are built around isolation, constrained outputs, and comprehensive logging. The linked GitHub blog describes dedicated containers, firewalled egress, buffered safe outputs, and trust-boundary logging designed to let teams run coding agents more safely in GitHub Actions.
Lemonade packages local AI inference behind an OpenAI-compatible server that targets GPUs and NPUs, aiming to make open models easier to deploy on everyday PCs.
A Hacker News discussion is resurfacing a Future Shock explainer that makes LLM memory costs concrete in GPU bytes instead of abstract architecture jargon. The piece traces how GPT-2, Llama 3, DeepSeek V3, Gemma 3, and Mamba-style models handle context retention differently.