Qwen3.6-27B beats Qwen3.5-397B on coding and ships under Apache 2.0
Original: Qwen pushed Qwen3.6-27B as an Apache 2.0 open-weight dense model that beats its 397B predecessor on coding benchmarks
What the tweet revealed
Qwen used a classic open-model provocation for this launch: the claim that Qwen3.6-27B surpasses Qwen3.5-397B-A17B across all major coding benchmarks while supporting both thinking and non-thinking modes. The post also stressed that the model is dense, open-source, and released under Apache 2.0, which is a big part of the commercial story around the Qwen line.
The Qwen account usually posts first-party model releases and ecosystem links, and this tweet follows that pattern closely. What makes it material is the specific comparison class. Qwen is not only saying the model is strong for its size. It is claiming a 27B dense model now beats a prior 397B-class family member across major coding benchmarks.
What the public model card adds
The Hugging Face README is much more useful than the tweet alone. It describes Qwen3.6-27B as the first open-weight variant of Qwen3.6, built around stability and real-world utility, with native context length of 262,144 tokens and a “thinking preservation” feature that retains reasoning context across message history. The card confirms Apache 2.0 licensing and then provides the actual benchmark grid.
On that grid, Qwen3.6-27B posts 77.2 on SWE-bench Verified versus 76.2 for Qwen3.5-397B-A17B, 53.5 versus 50.9 on SWE-bench Pro, 48.2 versus 30.0 on SkillsBench, and 36.2 versus 32.2 on NL2Repo. On Terminal-Bench 2.0 it matches the larger predecessor at 59.3. The point is not that Qwen has closed the gap to every frontier closed model; Claude 4.5 Opus still sits above it on several rows. The point is that Qwen is narrowing the cost-and-size argument for open coding systems.
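The size of those gaps is easier to see as per-benchmark deltas. The following is a minimal sketch that uses only the scores quoted in the paragraph above; the names SCORES, score_deltas, and tally are illustrative and do not come from Qwen's model card or any evaluation harness.

```python
# Scores as quoted in this article: (Qwen3.6-27B, Qwen3.5-397B-A17B).
# The dictionary layout and function names are hypothetical, for illustration only.
SCORES = {
    "SWE-bench Verified": (77.2, 76.2),
    "SWE-bench Pro": (53.5, 50.9),
    "SkillsBench": (48.2, 30.0),
    "NL2Repo": (36.2, 32.2),
    "Terminal-Bench 2.0": (59.3, 59.3),
}


def score_deltas(scores):
    """Return {benchmark: 27B score minus 397B score}, rounded to one decimal."""
    return {name: round(new - old, 1) for name, (new, old) in scores.items()}


def tally(deltas):
    """Count benchmarks where the 27B model leads, ties, or trails."""
    wins = sum(1 for d in deltas.values() if d > 0)
    ties = sum(1 for d in deltas.values() if d == 0)
    losses = sum(1 for d in deltas.values() if d < 0)
    return wins, ties, losses


if __name__ == "__main__":
    deltas = score_deltas(SCORES)
    for name, delta in deltas.items():
        print(f"{name:<20} {delta:+.1f}")
    print("wins/ties/losses:", tally(deltas))
```

On these five rows the 27B model leads on four and ties on one. Its margin is widest on SkillsBench and narrowest on SWE-bench Verified, which is worth keeping in mind when reading "across all major coding benchmarks."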
What to watch next
The obvious next step is third-party verification. Teams will want to see whether the published numbers survive outside Qwen’s own harnesses, how well the thinking and non-thinking modes behave in production, and how quickly serving stacks such as vLLM and SGLang stabilize around the 262K context path. If those pieces hold, Qwen3.6-27B could become one of the more important open deployment options in the coding-model market.
Sources: X source tweet · Hugging Face model card · GitHub README