The Qwen research team has officially confirmed through a published paper that GPQA and HLE (Humanity's Last Exam) benchmark datasets contain serious quality issues — including OCR errors, incorrect gold-standard answers, and unverifiable questions — casting doubt on the reliability of current AI model evaluations.
LLM Reddit Feb 23, 2026 1 min read
LLM Feb 22, 2026 1 min read
Alibaba launched Qwen 3.5 on February 16 under Apache 2.0, featuring 397B parameters with a sparse MoE architecture (17B active), 256K context, and native multimodal capabilities matching leading US proprietary models on key benchmarks.
LLM Reddit Feb 17, 2026 2 min read
An r/LocalLLaMA post on Qwen 3.5 gained 123 upvotes and linked directly to the public weights and model documentation. The linked model card confirms key specs: 397B total parameters, 17B activated, and a 262,144-token native context length.