LocalLLaMA's interest went beyond a flashy speed number. A post claiming 105-108 tokens per second and a full 256k native context window for Qwen3.6-27B-INT4 on a single RTX 5090 turned the thread into a practical discussion of how much quality survives once local inference gets this fast.
Text rendering is still a weak spot for image models, so Qwen’s latest release matters because it pairs prompt control with a top-10 benchmark. The team tied the launch to a No. 9 global Text-to-Image result and follow-up examples claiming cleaner multilingual typography.
LocalLLaMA lit up at the idea that a 27B model could tie Sonnet 4.6 on an agentic index, but the thread turned just as fast to benchmark gaming, real context windows, and what people can actually run at home.
r/LocalLLaMA reacted because this was not just another “new model out” post. The claim was concrete: Qwen3.6-27B running at about 80 tokens per second with a 218k context window on a single RTX 5090 via vLLM 0.19.
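The post did not include a launch command, so the following is only a minimal sketch of the kind of vLLM offline setup it describes. The Hugging Face repo id and the KV-cache dtype are assumptions, not details from the thread; the post only cites vLLM 0.19 and the headline numbers.

```python
from vllm import LLM, SamplingParams

# Sketch of the claimed setup. The repo id is a guess, and fp8 KV cache is an
# assumption: a 218k window on a single 32 GB card almost certainly needs a
# compressed KV cache on top of the INT4 weights.
llm = LLM(
    model="Qwen/Qwen3.6-27B",      # hypothetical Hugging Face repo id
    max_model_len=218_000,         # the 218k window claimed in the thread
    gpu_memory_utilization=0.95,
    kv_cache_dtype="fp8",
)

params = SamplingParams(temperature=0.7, max_tokens=256)
out = llm.generate(["Explain KV-cache quantization in two sentences."], params)
print(out[0].outputs[0].text)
```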
LocalLLaMA reacted because the post did not just tweak a benchmark table. It went after a widely repeated local-inference assumption and showed that whether it holds changes sharply by model family, especially for Gemma. By crawl time on April 25, 2026, the thread had 324 points and 58 comments.
LocalLLaMA reacted like dense models had suddenly become fun again. The official Qwen numbers were strong, but the real community energy came from people immediately asking about quants, GGUF builds, and whether 27B had become the practical sweet spot. By crawl time on April 25, 2026, the thread had 1,688 points and 603 comments.
LocalLLaMA upvoted this because a 27B open model suddenly looked competitive on agent-style work, not because everyone agreed on the benchmark. The thread stayed lively precisely because the result felt important and a little suspicious at the same time.
LocalLLaMA was not impressed by another TTS clip so much as by a build log. The post that took off showed Qwen3-TTS running locally in real time, quantized through llama.cpp, with extra alignment work to make subtitles and lip sync behave.
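The build log itself was not shared, but the quantization step it describes is llama.cpp's stock workflow. A minimal sketch, with placeholder filenames; the real-time streaming, subtitle alignment, and lip-sync work were the poster's own additions and are out of scope here.

```python
import subprocess

# Stock llama.cpp quantization: FP16 GGUF in, Q4_K_M GGUF out.
# Filenames are placeholders; the post did not share its exact files.
subprocess.run(
    ["./llama-quantize", "qwen3-tts-f16.gguf", "qwen3-tts-q4_k_m.gguf", "Q4_K_M"],
    check=True,
)
```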
What energized LocalLLaMA was not just another Qwen score jump. It was the claim that changing the agent scaffold moved the same family of local models from 19% to 45% to 78.7%, making benchmark comparisons feel less settled than many assumed.
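The post's harnesses are not public, so the sketch below only illustrates what "changing the agent scaffold" means in practice: the same model call wrapped in a one-shot harness versus a tool-feedback loop. Every name here is hypothetical, and `call_model` returns a canned reply so the sketch runs as-is.

```python
def call_model(prompt: str) -> str:
    """Stand-in for a local model call (e.g. any OpenAI-compatible endpoint)."""
    return "FINAL: placeholder answer"  # canned reply so the sketch executes

def single_shot(task: str) -> str:
    # Scaffold A: one prompt, one answer; the kind of harness behind low scores.
    return call_model(f"Solve this task:\n{task}")

def tool_loop(task: str, run_tool, max_steps: int = 8) -> str:
    # Scaffold B: a ReAct-style loop where the model sees tool output each step.
    transcript = f"Task:\n{task}"
    for _ in range(max_steps):
        step = call_model(transcript + "\nNext action, or 'FINAL: <answer>':")
        if step.startswith("FINAL:"):
            return step
        transcript += f"\nAction: {step}\nObservation: {run_tool(step)}"
    return call_model(transcript + "\nGive your FINAL answer now:")

print(single_shot("List the files changed in the last commit."))
```

The weights never change between the two harnesses; only the loop around them does, which is exactly why the thread treated cross-scaffold numbers as incomparable.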
HN read Qwen3.6-27B less as another scorecard win and more as an open coding model people can plausibly run. The comments focused on memory footprint, self-hosting, and the operational simplicity of a dense model.
Why it matters: an open-weight 27B dense model is now being pitched against much larger coding systems on real agent tasks. Qwen’s own model card lists SWE-bench Verified at 77.2 for Qwen3.6-27B versus 76.2 for Qwen3.5-397B-A17B, with Apache 2.0 licensing.
Why it matters: search products need factuality and citations, not just fluent answers. Perplexity said its SFT + RL pipeline lets Qwen models match or beat GPT models on factuality at lower cost.