Qwen3.6 35B Transforms Workflows Through Skill-Based Prompting
Original: Qwen3.6 35Ba3 has changed my workflows and even how I use my computer View original →
How One User Rebuilt Their Workflow
A LocalLLaMA user shared a detailed account of how Qwen3.6 35B A3B changed not just their coding workflow but their overall computer usage — earning over 340 upvotes. The approach is less about raw model capability and more about a structured system built on top of it.
The Three-Layer Workflow
First, use Codex to perform specific tasks while documenting the process — including errors — as a reusable "skill." Second, feed that skill to the pi agent. Third, Qwen3.6 equipped with these skills reliably handles tasks that previously required significant trial and error.
Real Tasks Automated
- DevOps management on a VPS
- Converting old PDFs to EPUBs using Docling
The skill documentation acts as pre-loaded context: it tells the model what errors to expect and how to handle them before the task begins.
The Broader Point
Local LLM utility depends as much on how well users structure their knowledge systems as on model capability itself. Qwen3.6's reasoning paired with a well-maintained skill library and agent framework enables personal automation at a level that previously required much larger or proprietary models.
Related Articles
A community user achieved 110 tokens/second running Qwen3.6 35B A3B on an RTX 4070 Super 12GB via ik_llama.cpp, a fork with superior CPU offload optimization that significantly outperforms upstream llama.cpp's Multi-Token Prediction implementation.
A r/LocalLLaMA benchmark compared 21 local coding models on HumanEval+, speed, and memory, putting Qwen 3.6 35B-A3B on top while surfacing practical RAM and tok/s trade-offs.
LocalLLaMA reacted because --fit challenged the old rule of thumb that anything outside VRAM means painfully slow inference.
Comments (0)
No comments yet. Be the first to comment!