Alibaba Releases Qwen3.5 Small Models: 9B Achieves GPT-oss 20B–120B Level Performance
Overview
Alibaba's Qwen team has released the Qwen3.5 small model series, comprising three sizes: 0.8B, 4B, and 9B parameters. All models are immediately available on Hugging Face with GGUF quantizations from unsloth and community contributors.
Key Performance
Community benchmarks place the Qwen3.5 9B model at a level comparable to GPT-oss models in the 20B–120B range, an exceptional degree of parameter efficiency that puts high-quality inference within reach of users with mid-range consumer GPUs.
The 0.8B model targets mobile deployment, while the 4B model offers a compelling middle ground. The community quickly noted that disabling thinking mode and setting temperature around 0.45 yields the best results, as the models tend to overthink in reasoning tasks. Additionally, bf16 KV cache (not f16) is required for optimal performance on engines like llama.cpp.
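Those recommendations can be sketched as a request to a local llama.cpp server's OpenAI-compatible endpoint. This is a minimal illustration, not an official recipe: the model id is hypothetical, and the `/no_think` prompt toggle follows the Qwen3 chat-template convention, which is assumed to carry over to Qwen3.5.

```python
import json

# Sketch of a chat-completions request body with the community-recommended
# settings: thinking disabled and temperature around 0.45. The model id
# "qwen3.5-9b" is illustrative, not an official identifier.
payload = {
    "model": "qwen3.5-9b",
    "temperature": 0.45,  # community-recommended sampling temperature
    "messages": [
        # "/no_think" disables reasoning traces in Qwen3's chat template
        # (assumed here to apply to Qwen3.5 as well)
        {"role": "system", "content": "/no_think You are a concise assistant."},
        {"role": "user", "content": "Summarize the Qwen3.5 release in one sentence."},
    ],
}

# In practice this JSON body would be POSTed to
# http://localhost:8080/v1/chat/completions; here we only serialize it.
body = json.dumps(payload, indent=2)
print(body)
```

The bf16 KV-cache recommendation is a server-side setting rather than a request parameter; with llama.cpp it would be passed when launching the server, e.g. via `--cache-type-k bf16 --cache-type-v bf16` (flag names per current llama.cpp builds; verify against your version).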
Community Reception
The LocalLLaMA community responded with immediate enthusiasm, with quantized versions appearing within hours of release. Multiple benchmark comparisons against Qwen 3 predecessors are already being shared, showing clear improvements across standard evaluation metrics.
Availability
Models are available at Hugging Face under the Qwen organization, with GGUF variants from unsloth available for llama.cpp and compatible runtimes.
Related Articles
Alibaba Qwen team released the Qwen 3.5 small model series (0.8B to 9B). Models run in-browser via WebGPU and show dramatic benchmark improvements over previous generations.
Alibaba launched Qwen3.5, a 397B-parameter open-weight multimodal model supporting 201 languages. The company claims it outperforms GPT-5.2, Claude Opus 4.5, and Gemini 3 on benchmarks, while costing 60% less than its predecessor.
Users on r/LocalLLaMA have spotted Qwen3.5 model names appearing in Alibaba's official Qwen chat interface, signaling an imminent release of the next generation of Alibaba's open-source LLM series.