The LocalLLaMA angle is not just the 1000+ tps headline, but whether FP4, DFlash, and commodity GPU kernels can be reproduced outside Xiaomi’s hosted trial.
LocalLLaMA lit up because Xiaomi MiMo dropped an MIT-licensed MoE with 1.02T total parameters, 42B active parameters, and a 1M-token context window. The excitement was real, but so was the hardware reality check: people loved the openness and agentic claims while joking about how many serious GPUs you still need.
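The hardware joke can be made concrete with some back-of-the-envelope arithmetic. The sketch below uses only the parameter counts quoted above. Everything else is an assumption for illustration: the 80 GB and 24 GB GPU sizes, and the simplification that weights are the only memory cost (no KV cache, activations, or quantization scale overhead).

```python
import math

TOTAL_PARAMS = 1.02e12   # total parameters, as quoted in the post
ACTIVE_PARAMS = 42e9     # active parameters per token, as quoted in the post


def weight_gb(params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Ignores quantization scales, KV cache, and activations,
    so real deployments need more than this.
    """
    return params * bits_per_param / 8 / 1e9


def min_gpus(params: float, bits_per_param: float, gpu_mem_gb: float) -> int:
    """Lower bound on GPU count needed just to hold the weights."""
    return math.ceil(weight_gb(params, bits_per_param) / gpu_mem_gb)


if __name__ == "__main__":
    # Assumed GPU memory sizes (not from the post): 80 GB datacenter card, 24 GB consumer card.
    for label, bits in (("FP4", 4), ("BF16", 16)):
        gb = weight_gb(TOTAL_PARAMS, bits)
        print(f"{label}: ~{gb:.0f} GB of weights, "
              f">= {min_gpus(TOTAL_PARAMS, bits, 80)} x 80 GB GPUs, "
              f">= {min_gpus(TOTAL_PARAMS, bits, 24)} x 24 GB GPUs")
    # Weights touched per token are far smaller, but all experts typically stay resident.
    print(f"Active weights at FP4: ~{weight_gb(ACTIVE_PARAMS, 4):.0f} GB")
```

Under these assumptions, FP4 weights alone come to roughly 510 GB, which is at least seven 80 GB cards before any memory is set aside for a long context. The 42B active parameters lower per-token compute, but they do not shrink the weight footprint unless experts are offloaded.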