A Hacker News discussion is resurfacing a Future Shock explainer that makes LLM memory costs concrete in GPU bytes instead of abstract architecture jargon. The piece traces how GPT-2, Llama 3, DeepSeek V3, Gemma 3, and Mamba-style models handle context retention differently.
#memory
RSS FeedA post in r/artificial argues that long-running agents may need decay, reinforcement, and selective forgetting more than another vector database, prompting a discussion about episodic memory, compression, and retrieval quality.
Anthropic has launched a memory import feature that lets users bring their preferences and context from other AI providers to Claude with a single copy-paste, available on all paid plans.
Andrej Karpathy highlights the fundamental memory+compute trade-off challenge in LLMs: fast but small on-chip SRAM versus large but slow off-chip DRAM. He calls optimizing this the most intellectually rewarding puzzle in AI infrastructure today, pointing to NVIDIA's $4.6T market cap as proof.
GitHub announced public preview availability of Copilot’s cross-agent memory for Copilot coding agent, Copilot CLI, and Copilot code review. The system is repository-scoped, citation-verified, opt-in, and accompanied by reported improvements in evaluation and A/B test metrics.