Qwen 3.5 Small Models Released: From 0.8B to 9B, Now Running in Browsers

Original: "Breaking: The small qwen3.5 models have been dropped"

LLM · Mar 3, 2026 · By Insights AI (Reddit) · 1 min read

Qwen 3.5 Small Models Drop

Alibaba's Qwen team has released the Qwen 3.5 small model series to massive community excitement, garnering a score of 1,663 on r/LocalLLaMA — one of the highest scores seen for a model release. The lineup includes 0.8B, 2B, 4B, and 9B parameter models.

Hybrid Architecture Innovation

Qwen 3.5 introduces a hybrid architecture combining Gated DeltaNet layers with standard Gated Attention. The 9B model features 32 layers and 4096 hidden dimensions, with an integrated vision encoder enabling multimodal capabilities. Because Gated DeltaNet is a linear-attention design, its cost grows linearly rather than quadratically with sequence length, which gives a significant efficiency gain over pure transformer architectures, particularly at long context.
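The exact interleaving of the two block types is not detailed here, but the idea can be sketched as a layer schedule in which most layers are linear-attention DeltaNet blocks and a minority are full attention. The 3:1 ratio below is an illustrative assumption, not a confirmed Qwen 3.5 configuration:

```python
# Sketch of a hybrid layer schedule mixing linear-attention (Gated DeltaNet)
# blocks with standard Gated Attention blocks. The 3:1 ratio is an
# illustrative assumption, not a confirmed Qwen 3.5 figure.

def hybrid_layer_schedule(num_layers: int, attention_every: int = 4) -> list[str]:
    """Return the block type for each of `num_layers` layers.

    Every `attention_every`-th layer is full Gated Attention; the rest
    are Gated DeltaNet, keeping most of the stack linear in sequence length.
    """
    return [
        "gated_attention" if (i + 1) % attention_every == 0 else "gated_deltanet"
        for i in range(num_layers)
    ]

schedule = hybrid_layer_schedule(32)      # 9B model: 32 layers
print(schedule.count("gated_deltanet"))   # 24 linear-attention layers
print(schedule.count("gated_attention"))  # 8 full-attention layers
```

With this schedule only a quarter of the layers pay the quadratic attention cost; the rest scale linearly with sequence length.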

Remarkable Small Model Performance

The 0.8B model runs directly in browsers via WebGPU using Transformers.js, and can execute locally on seven-year-old Android devices such as the Samsung Galaxy S10e. Community benchmarks show substantial gains over similarly sized Qwen 3 models in every reported category.

Practical Deployment Options

The 9B model proves capable at agentic coding tasks, the 4B runs on a Raspberry Pi 5, the 2B excels at OCR, and the 0.8B sets a new bar for on-device AI on Android. Unsloth rapidly released optimized GGUF quantizations, making these models immediately accessible via llama.cpp and other runtimes.
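A quick back-of-envelope calculation shows why these sizes fit edge hardware. Assuming roughly 4.5 bits per weight for a typical 4-bit GGUF quantization (an approximation; the exact figure varies by quant scheme):

```python
def quantized_weight_gib(params_billions: float, bits_per_weight: float = 4.5) -> float:
    """Approximate weight footprint in GiB for a quantized model.

    bits_per_weight=4.5 roughly approximates a 4-bit GGUF quantization;
    KV cache and runtime overhead come on top of this figure.
    """
    return params_billions * 1e9 * bits_per_weight / 8 / 2**30

for size in (0.8, 2, 4, 9):
    print(f"{size}B -> ~{quantized_weight_gib(size):.2f} GiB")
```

On these assumptions the 4B model's weights come to roughly 2 GiB, comfortably inside a Raspberry Pi 5's 8 GB of RAM, and the 0.8B to under half a GiB, which is why it is plausible even on older Android phones.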

Impact on Open-Source AI

This release reinforces the trajectory of small open-source models closing the gap with much larger proprietary systems. With capable models now running in browsers, on phones, and on edge hardware without cloud APIs, the democratization of AI inference is accelerating rapidly.

