#open-weights

LLM Reddit 15h ago 2 min read

DeepSeek V4 Lands on Hugging Face and LocalLLaMA Immediately Starts Doing the RAM Math

LocalLLaMA did not just celebrate the DeepSeek V4 release. The thread instantly turned into a collective calculation about 1M context, activated parameters, and what this actually means for real hardware, with MIT license praise mixed in.

#deepseek-v4 #open-weights #moe

AI sources.twitter 1d ago 2 min read

DeepSeek-V4 opens 1M context with 1.6T/49B and 284B/13B split

Why it matters: open models rarely arrive with both giant context claims and deployable model splits. DeepSeek put hard numbers on the release with a 1M-context design, a 1.6T/49B Pro model, and a 284B/13B Flash variant.

#deepseek #open-weights #llm

LLM Reddit 1d ago 2 min read

LocalLLaMA Treats Qwen 3.6 27B as a Dense-Model Moment, Not Just Another Release

LocalLLaMA reacted like dense models had suddenly become fun again. The official Qwen numbers were strong, but the real community energy came from people immediately asking about quants, GGUF builds, and whether 27B had become the practical sweet spot. By crawl time on April 25, 2026, the thread had 1,688 points and 603 comments.

#qwen #open-weights #coding-models

LLM Reddit 2d ago 2 min read

LocalLLaMA Sees Qwen3.6 27B as the Small Open Model That Got Too Close for Comfort

LocalLLaMA upvoted this because a 27B open model suddenly looked competitive on agent-style work, not because everyone agreed on the benchmark. The thread stayed lively precisely because the result felt important and a little suspicious at the same time.

#qwen #open-weights #benchmarks

AI 2d ago 2 min read

OpenAI ships Privacy Filter, a 1.5B open model for local PII masking

Privacy tooling usually breaks at scale or forces raw text onto a server. OpenAI’s 1.5B open-weight Privacy Filter runs locally, handles 128,000-token inputs, and posts 97.43% F1 on a corrected PII-Masking-300k benchmark.

#openai #privacy #security

LLM Hacker News 2d ago 2 min read

HN Spots the Real DeepSeek V4 Story: The Docs Link Was Thin, but the Weights Were Already Live

HN did not latch onto DeepSeek V4 because of a polished launch page. The thread took off when commenters realized the front-page link was just updated docs while the weights and base models were already live for inspection.

#deepseek #llm #moe

LLM Hacker News 3d ago 1 min read

Why HN cared more about Qwen3.6’s 27B dense form than the benchmark table

HN read Qwen3.6-27B less as another scorecard win and more as an open coding model people can plausibly run. The comments focused on memory footprint, self-hosting, and the operational simplicity of a dense model.

#qwen #qwen3.6 #coding-model

LLM sources.twitter 3d ago 2 min read

Qwen3.6-27B beats Qwen3.5-397B on coding and ships under Apache 2.0

Why it matters: an open-weight 27B dense model is now being pitched against much larger coding systems on real agent tasks. Qwen’s own model card lists SWE-bench Verified at 77.2 for Qwen3.6-27B versus 76.2 for Qwen3.5-397B-A17B, with Apache 2.0 licensing.

#qwen #open-weights #coding-models

LLM Reddit 4d ago 1 min read

LocalLLaMA Jumps on Qwen3.6-27B: 27B Dense Model, 262K Context

LocalLLaMA treated Qwen3.6-27B like a practical ownership moment: not just a model card, but a race to quantize, run, and compare it locally.

#qwen #local-llm #open-weights

LLM Hacker News 4d ago 2 min read

Kimi K2.6 turned HN’s model debate toward open-weight coding agents

HN read Kimi K2.6 as a test of whether open-weight coding agents can last through real engineering work. The 12-hour and 13-hour coding cases drew attention, while commenters immediately pressed on speed, provider accuracy, and benchmark realism.

#kimi #coding-agents #open-weights

AI sources.twitter Apr 17, 2026 2 min read

Qwen3.6-35B-A3B opens 35B MoE weights with 3B active parameters

Why it matters: Alibaba is putting a small-active-parameter multimodal coding model into open weights rather than keeping it API-only. The tweet says Qwen3.6-35B-A3B has 35B total parameters, 3B active parameters, and an Apache 2.0 license; the blog reports 73.4 on SWE-bench Verified and 51.5 on Terminal-Bench 2.0.

#qwen #open-weights #moe

LLM Hacker News Apr 16, 2026 1 min read

HN Sees Qwen3.6-35B-A3B as a Small Active-Parameter Bet for Coding Agents

HN latched onto the open-weight angle: a 35B MoE model with only 3B active parameters is interesting if it can actually carry coding-agent work. Qwen says Qwen3.6-35B-A3B improves sharply over Qwen3.5-35B-A3B, while commenters immediately moved to GGUF builds, Mac memory limits, and whether open-model-only benchmark tables are enough context.

#qwen #open-weights #coding-agents