#dflash - Insights

LLM Reddit Apr 14, 2026 2 min read

Reddit Spots an Open-Source DFlash Runtime That Pushes Qwen3.5 to 4x Speeds on Apple Silicon

LocalLLaMA paid attention to this post because it looked like real engineering cleanup instead of another inflated speed screenshot. On April 13, 2026, the author said a stock-MLX baseline for Qwen3.5-9B at 2048 tokens improved from 30.96 tok/s to 127.07 tok/s, with 89.36% acceptance and the full runtime released as open source.

#dflash #speculative-decoding #mlx