A well-received r/LocalLLaMA experiment described tinyforge: Qwen 3.5 0.8B running on a MacBook Air, fine-tuned on 13 self-generated repair pairs harvested from a test-feedback loop, with a reported holdout-score jump from 16/50 to 28/50.
#small-models
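The post itself is prose-only, so the following is a minimal Python sketch of how such a repair-pair harvest might work. Everything here (the `generate` wrapper around the local model, the task/test format, and the JSONL output) is an assumption for illustration, not a detail from the original experiment.

```python
"""Sketch of a test-feedback loop that harvests (broken, fixed) repair pairs.

Assumed, not from the original post: `generate` wraps the local small model,
each task carries a prompt plus a callable test, and verified pairs are
written as JSONL for a later supervised fine-tuning pass.
"""
import json
from typing import Callable

def collect_repair_pairs(
    generate: Callable[[str], str],   # completion call into the local model
    tasks: list[dict],                # each: {"prompt": str, "test": Callable[[str], bool]}
    out_path: str = "repair_pairs.jsonl",
    max_retries: int = 3,
) -> int:
    """Run each task; when a failing attempt is later repaired, keep the pair."""
    kept = 0
    with open(out_path, "w") as f:
        for task in tasks:
            attempt = generate(task["prompt"])
            if task["test"](attempt):
                continue              # passed on the first try: nothing to learn
            broken = attempt
            for _ in range(max_retries):
                repair_prompt = (
                    f"{task['prompt']}\n\nThis attempt fails its tests:\n"
                    f"{broken}\n\nProduce a corrected version."
                )
                fixed = generate(repair_prompt)
                if task["test"](fixed):
                    # A verified (broken -> fixed) pair becomes one training example.
                    f.write(json.dumps({"prompt": task["prompt"],
                                        "broken": broken,
                                        "fixed": fixed}) + "\n")
                    kept += 1
                    break
                broken = fixed        # keep iterating on the latest failure
    return kept
```

A supervised pass (e.g. a short LoRA run over the resulting JSONL) would then update the weights, and the loop can repeat with the improved model.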
A widely shared r/LocalLLaMA comparison of Qwen's smallest models across three generations (score: 681) reveals extraordinary efficiency gains. The Qwen 3.5 9B now outperforms the previous-generation 80B on several benchmarks, while the 2B handles video understanding better than many 7B models.
The Alibaba Qwen team released the Qwen 3.5 small-model series (0.8B to 9B). The models run in-browser via WebGPU and show dramatic benchmark improvements over previous generations.
Researchers have demonstrated that transformer models with fewer than 100 parameters can add two 10-digit numbers with 100% accuracy. The key ingredient is digit tokenization rather than treating numbers as opaque strings — a finding with implications for mathematical reasoning in larger LLMs.
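To make the tokenization contrast concrete, here is a small Python sketch. The vocabulary, zero-padding, and least-significant-digit-first reversal are assumptions drawn from common practice in arithmetic-transformer work, not details of this particular result.

```python
# Illustrative sketch of digit tokenization vs. opaque-string tokenization.
# The vocabulary and formatting choices below are assumptions, not the paper's.

DIGIT_VOCAB = {ch: i for i, ch in enumerate("0123456789+=")}

def tokenize_digits(expr: str) -> list[int]:
    """One token per character: '95+07=' -> [9, 5, 10, 0, 7, 11]."""
    return [DIGIT_VOCAB[ch] for ch in expr]

def make_addition_example(a: int, b: int, width: int = 10) -> list[int]:
    """Zero-pad both operands and reverse them so the carry flows in
    generation order (an assumed, widely used formatting trick)."""
    lhs = f"{a:0{width}d}"[::-1]
    rhs = f"{b:0{width}d}"[::-1]
    return tokenize_digits(f"{lhs}+{rhs}=")

# A subword tokenizer might instead emit one opaque token for '9500000007',
# hiding the positional structure the addition algorithm needs.
print(make_addition_example(95, 7, width=4))  # [5, 9, 0, 0, 10, 7, 0, 0, 0, 11]
```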