LLM Reddit 5h ago
The Reddit thread centered on a practical claim with real systems implications: replace TurboQuant's dense rotation with structured rotor math, keep attention fidelity close to the dense baseline, and make the kernel substantially cheaper on both NVIDIA and Apple hardware.
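The thread does not spell out TurboQuant's internals, but the cost argument behind "structured rotor math" is standard: a dense orthogonal rotation of a d-dimensional vector costs O(d²), while a structured orthogonal transform such as the fast Walsh-Hadamard transform costs O(d log d) and needs no stored matrix. The sketch below (function name `fwht` and the use of a Hadamard rotation are my illustrative assumptions, not a claim about TurboQuant's actual kernel) shows the fast version agreeing with the explicit dense rotation:

```python
import numpy as np

def fwht(x: np.ndarray) -> np.ndarray:
    """Orthonormal fast Walsh-Hadamard transform of a length-2^k vector.

    Runs in O(d log d) with no stored rotation matrix, versus O(d^2)
    for a dense matrix-vector product. Illustrative sketch only --
    not TurboQuant's actual rotation.
    """
    x = x.astype(np.float64).copy()
    d = x.shape[0]
    h = 1
    while h < d:
        # Butterfly stage: combine elements h apart with +/- sums.
        for i in range(0, d, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x / np.sqrt(d)  # scale so the transform is orthonormal

d = 8
rng = np.random.default_rng(0)
x = rng.standard_normal(d)

# Build the equivalent dense Hadamard rotation explicitly (Sylvester
# construction) to check the fast path against a plain matmul.
H = np.array([[1.0]])
while H.shape[0] < d:
    H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))

y_fast = fwht(x)
y_dense = (H @ x) / np.sqrt(d)
```

Because the transform is orthogonal, it preserves norms and inner products, which is why such a rotation can precede quantization without degrading attention fidelity much: it spreads outlier energy across dimensions while remaining exactly invertible.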