LLM X/Twitter Apr 22, 2026 1 min read
Why it matters: post-training for agents increasingly depends on reinforcement learning throughput, not only inference speed. NVIDIA says NeMo RL’s FP8 path delivers a 1.48x speedup on RL workloads with Qwen3-8B-Base while tracking BF16 accuracy.
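
To put the 1.48x figure in wall-clock terms, here is a minimal sketch of the arithmetic. It assumes the speedup applies to the whole run end to end, which the post does not state, and the 100-hour baseline is a hypothetical number chosen only for illustration.

```python
def time_after_speedup(baseline_hours: float, speedup: float) -> float:
    """Return wall-clock time once a throughput speedup factor is applied."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_hours / speedup


BASELINE_HOURS = 100.0  # hypothetical BF16 run length, not a figure from NVIDIA
SPEEDUP = 1.48          # speedup reported for the FP8 path on Qwen3-8B-Base

fp8_hours = time_after_speedup(BASELINE_HOURS, SPEEDUP)
saved_fraction = 1 - 1 / SPEEDUP  # share of wall-clock time removed

print(f"{fp8_hours:.1f} h instead of {BASELINE_HOURS:.0f} h "
      f"({saved_fraction:.1%} less wall-clock time)")
```

Under that assumption, a 1.48x speedup removes roughly a third of the run time, not half: the saving is 1 - 1/1.48, about 32%.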