Alibaba Releases Qwen3.5 Small Models: 9B Achieves GPT-oss 20B–120B Level Performance

Original: Breaking: The small qwen3.5 models have been dropped

LLM · Mar 2, 2026 · By Insights AI (Reddit) · 1 min read

Overview

Alibaba's Qwen team has released the Qwen3.5 small model series, comprising three sizes: 0.8B, 4B, and 9B parameters. All models are immediately available on Hugging Face with GGUF quantizations from unsloth and community contributors.

Key Performance

Community benchmarks show the Qwen3.5 9B model performing at a level comparable to GPT-oss models in the 20B–120B range: exceptional parameter efficiency that brings high-quality local inference within reach of mid-range consumer GPUs.

The 0.8B model targets mobile deployment, while the 4B model offers a compelling middle ground. The community quickly noted that disabling thinking mode and setting the temperature to around 0.45 yields the best results, as the models tend to overthink on reasoning tasks. Additionally, a bf16 KV cache (rather than f16) is recommended for best results on engines such as llama.cpp.
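As a rough sketch, those community-recommended settings map onto llama.cpp flags like the ones below. The GGUF filename is hypothetical, and the `/no_think` prompt tag follows the convention earlier Qwen models used for disabling thinking mode; neither is a confirmed detail of this release.

```shell
# Sketch of a llama.cpp invocation with the community-recommended settings.
# The model filename is illustrative; substitute the GGUF you actually downloaded.
# --cache-type-k / --cache-type-v select the KV cache precision (bf16, not f16).
llama-cli \
  -m Qwen3.5-9B-Q4_K_M.gguf \
  --temp 0.45 \
  --cache-type-k bf16 \
  --cache-type-v bf16 \
  -p "/no_think Explain KV caching in one sentence."
```

The same `--cache-type-*` and `--temp` options apply to `llama-server` if you prefer an OpenAI-compatible endpoint over the CLI.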

Community Reception

The LocalLLaMA community responded with immediate enthusiasm, with quantized versions appearing within hours of release. Multiple benchmark comparisons against Qwen 3 predecessors are already being shared, showing clear improvements across standard evaluation metrics.

Availability

Models are available at Hugging Face under the Qwen organization, with GGUF variants from unsloth available for llama.cpp and compatible runtimes.
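Fetching one of the community quantizations might look like the following; the repository and file names here are illustrative guesses, so check the actual listings under the Qwen and unsloth organizations on Hugging Face.

```shell
# Hypothetical download of a Qwen3.5 GGUF quantization via the Hugging Face CLI.
# Repo and filename are assumptions for illustration, not confirmed names.
huggingface-cli download \
  unsloth/Qwen3.5-9B-GGUF \
  Qwen3.5-9B-Q4_K_M.gguf \
  --local-dir ./models
```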


© 2026 Insights. All rights reserved.