13 Months After the DeepSeek Moment: How Far Has Local AI Come?

Original: 13 months since the DeepSeek moment, how far have we gone running models locally?

LLM Mar 2, 2026 By Insights AI (Reddit) 1 min read

13 Months of Local AI Progress

In early 2025, a Hugging Face engineer tweeted instructions for running the frontier-level DeepSeek R1 model at Q8 quantization at approximately 5 tokens per second (t/s), on about $6,000 of hardware.

This r/LocalLLaMA post (176 upvotes) provides a striking update: a model the post describes as significantly more capable now runs at the same speed on hardware costing a tenth as much. Specifically, Qwen3.5-27B at Q4 quantization runs at roughly 5 t/s on a $600 AOOSTAR mini PC.

Want More Usable Speeds?

For faster inference, Qwen3.5-35B-A3B, a mixture-of-experts (MoE) model, runs at 17-20 t/s at Q4/Q5 quantization on comparable hardware. That is fast enough for everyday AI assistance tasks.
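A rough sketch of why the MoE model can be so much faster on similar hardware: token generation on small machines is usually limited by memory bandwidth, so speed scales roughly with the bytes of weights read per token. The "A3B" suffix is read here as about 3B active parameters per token. The bits-per-weight figure and the bandwidth value are illustrative assumptions, not measurements from the post.

```python
def weight_bytes(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size in bytes of the weights that must be read."""
    return params_billion * 1e9 * bits_per_weight / 8


def max_tokens_per_second(active_params_billion: float,
                          bits_per_weight: float,
                          bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed if each active weight is read once per token."""
    bytes_per_token = weight_bytes(active_params_billion, bits_per_weight)
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Assumptions (not from the post): about 4.5 bits per weight for a
# Q4-style quantization, and 80 GB/s of usable memory bandwidth.
BITS_Q4 = 4.5
BANDWIDTH_GB_S = 80.0

# A dense 27B model reads all 27B weights for every generated token.
dense_27b = max_tokens_per_second(27, BITS_Q4, BANDWIDTH_GB_S)
# An MoE model reads only its active experts, assumed here to be ~3B weights.
moe_3b_active = max_tokens_per_second(3, BITS_Q4, BANDWIDTH_GB_S)

print(f"27B dense:          <= {dense_27b:.1f} t/s")
print(f"MoE, ~3B active:    <= {moe_3b_active:.1f} t/s")
print(f"Ratio of the bounds: {moe_3b_active / dense_27b:.1f}x")
```

Under these assumptions the dense bound lands near the reported 5 t/s, while the MoE bound sits well above the reported 17-20 t/s. That gap is expected, since the estimate ignores attention, the KV cache, always-active layers, and compute overhead.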

Looking Ahead

The author speculates that, at this pace, a 4B model better than today's best could be running locally within a year. In 13 months, the hardware cost of running a frontier-class model at about 5 t/s fell from roughly $6,000 to $600, and the post argues that the cheaper setup runs the more capable model. That trajectory suggests genuinely capable local AI on consumer hardware is no longer a distant prospect.
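The cost trajectory can be made concrete as dollars of hardware per token per second, using only the figures quoted in this article. This is a minimal sketch: the setup labels are illustrative, and treating "comparable hardware" as the same $600 machine is an assumption.

```python
def dollars_per_tps(hardware_cost_usd: float, tokens_per_second: float) -> float:
    """Hardware cost divided by generation speed."""
    return hardware_cost_usd / tokens_per_second


# (hardware cost in USD, tokens per second), as quoted in the article.
setups = {
    "early 2025, DeepSeek R1 at Q8": (6000, 5.0),
    "2026, 27B dense at Q4 on a mini PC": (600, 5.0),
    # Assumption: "comparable hardware" means the same $600 machine.
    "2026, 35B-A3B MoE at 17 t/s (low end)": (600, 17.0),
}

for name, (cost, tps) in setups.items():
    print(f"{name}: ${dollars_per_tps(cost, tps):,.0f} per t/s")
```

This metric ignores model quality and quantization level (Q8 versus Q4), so it captures only the hardware side of the comparison.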

Why This Matters

The democratization of local AI goes beyond cost savings. It enables privacy-first inference without cloud dependencies, makes high-quality AI accessible in regions with limited internet infrastructure, and shifts the balance of power away from cloud AI providers. The speed of this progress is one of the most remarkable dynamics in the current AI landscape.


© 2026 Insights. All rights reserved.