LLM Hacker News Apr 2, 2026 2 min read
A Hacker News discussion is resurfacing a Future Shock explainer that makes LLM memory costs concrete in GPU bytes instead of abstract architecture jargon. The piece traces how GPT-2, Llama 3, DeepSeek V3, Gemma 3, and Mamba-style models handle context retention differently.