Alibaba's Qwen team has released Qwen 3.5 Small, a new small dense model in their flagship open-source series. The announcement topped r/LocalLLaMA with over 1,000 upvotes, reflecting the local AI community's enthusiasm for capable small models.
#qwen
RSS FeedThe r/LocalLLaMA community is buzzing over Qwen 3.5-35B-A3B, which users report outperforms GPT-OSS-120B while being only one-third the size, making it an excellent local daily driver for development tasks.
A high-engagement r/LocalLLaMA thread reviewed Unsloth’s updated Qwen3.5-35B-A3B dynamic quantization release, including KLD/PPL data, tensor-level tradeoffs, and reproducibility artifacts.
A high-engagement LocalLLaMA follow-up benchmark reports that Qwen3.5-35B-A3B runs best on the tested RTX 5080 setup with Q4_K_M quantization, KV q8_0, and --fit without explicit batch flags.
A high-traffic LocalLLaMA thread tracked the release of Qwen3.5-122B-A10B on Hugging Face and quickly shifted into deployment questions. Community discussion centered on GGUF timing, quantization choices, and real-world throughput, while the model card highlighted a 122B total/10B active MoE design and long-context serving guidance.
A high-engagement r/LocalLLaMA thread reports strong early results for Qwen3.5-35B-A3B in local agentic coding workflows. The original poster cites 100+ tokens/sec on a single RTX 3090 setup, while comments show mixed reproducibility and emphasize tooling, quantization, and prompt pipeline differences.
A high-engagement r/LocalLLaMA post surfaced the Qwen3.5-35B-A3B model card on Hugging Face. The card emphasizes MoE efficiency, long context handling, and deployment paths across common open-source inference stacks.
Users on r/LocalLLaMA have spotted Qwen3.5 model names appearing in Alibaba's official Qwen chat interface, signaling an imminent release of the next generation of Alibaba's open-source LLM series.
Alibaba launched Qwen3.5, a 397B-parameter open-weight multimodal model supporting 201 languages. The company claims it outperforms GPT-5.2, Claude Opus 4.5, and Gemini 3 on benchmarks, while costing 60% less than its predecessor.
The Qwen research team has officially confirmed through a published paper that GPQA and HLE (Humanity's Last Exam) benchmark datasets contain serious quality issues — including OCR errors, incorrect gold-standard answers, and unverifiable questions — casting doubt on the reliability of current AI model evaluations.
A r/LocalLLaMA post on Qwen3.5 gained 123 upvotes and pointed directly to public weights and model documentation. The linked card confirms key specs including 397B total parameters, 17B activated, and 262,144 native context length.