#open-weights

LLM Reddit 3d ago 1 min read

Open-Weight AI Letter Turns Into a LocalLLaMA Policy Fight

The thread focused less on the existence of another policy letter and more on the unusual coalition behind it.

LLM X/Twitter Jul 17, 2026 1 min read

Thinking Machines opens Inkling weights for multimodal reasoning

Open-weight multimodal models just gained a serious new entrant. Thinking Machines released Inkling with full weights, 64K and 256K context options, and a direct fine-tuning path through Tinker.

#thinking-machines #inkling #open-weights

LLM Hacker News Jul 16, 2026 2 min read

Inkling shifts the open-weight question toward fine-tuning

HN readers focused less on leaderboard dominance and more on the package: Thinking Machines Lab is offering an open-weight multimodal MoE base with controllable reasoning effort and Tinker-based fine-tuning.

#thinking-machines #open-weights #multimodal

LLM X/Twitter Jun 21, 2026 1 min read

GLM 5.2 hits 64% on Vibe Code Bench as open weights close in

Open-weight coding models crossed a new practical threshold. Vals AI says GLM 5.2 scored 64% on Vibe Code Bench v1.1, at least 14 percentage points ahead of the next open-weight model.

#glm-5-2 #open-weights #benchmark

LLM Reddit Jun 18, 2026 2 min read

Local LLM users want the missing 80-160B middle

The LocalLLaMA thread is less about bigger models for their own sake and more about hardware buyers who now have memory capacity without a fresh model tier to use it well.

#localllama #local-llm #unified-memory

LLM Hacker News Jun 18, 2026 1 min read

GLM-5.2 pushes open weights into the cost-versus-reasoning debate

The community debate moved beyond rank: GLM-5.2 looks strong, but output-token hunger and latency now matter as much as benchmark position.

#glm #open-weights #benchmarks

LLM X/Twitter Jun 13, 2026 1 min read

MiniMax M3 weights hit Hugging Face with 428B total parameters

MiniMax has moved M3 from model teaser to open-weight distribution. The Hugging Face card lists about 428B total parameters, 23B activated parameters, and a 1M-token context window.

#minimax #open-weights #multimodal

LLM Hacker News Jun 4, 2026 1 min read

Gemma 4 12B puts the spotlight on encoder-free multimodal local AI

The thread’s energy centered on the architecture claim: what does “encoder-free” really mean for a 12B multimodal model?

#gemma #multimodal #open-weights

LLM Reddit May 26, 2026 1 min read

NuExtract3 targets local document extraction with a 4B VLM

LocalLLaMA focused less on OCR novelty and more on the practical package: open weights, self-hosting, and a low VRAM floor.

#nuextract3 #vlm #ocr

LLM Hacker News May 2, 2026 1 min read

DeepSeek V4: Near-Frontier LLM Performance at a Fraction of the Cost

DeepSeek released DeepSeek-V4-Pro (1.6T total parameters, 49B active) and V4-Flash (284B total, 13B active), both Mixture-of-Experts models released under the MIT license with a 1M-token context window. V4-Pro is the largest open-weight model released so far, and its pricing of $1.74 per million input tokens undercuts GPT-5.4 and Claude Sonnet 4.6 by more than half.

#deepseek #llm #open-weights

LLM Hacker News Apr 30, 2026 2 min read

HN cared less about the launch copy than the 128B and 256K math behind Mistral Medium 3.5

Hacker News paid attention to Mistral Medium 3.5 because the size-to-capability tradeoff looked real: a 128B dense model with a 256K context window, open weights, and self-hosting claims that do not immediately drift into fantasy. The launch also tied the model to remote coding agents in Vibe and a new Work mode in Le Chat.

#mistral #open-weights #coding-agents

LLM Reddit Apr 30, 2026 2 min read

LocalLLaMA locks onto one word in Mistral Medium 3.5: dense

LocalLLaMA latched onto one detail immediately: dense 128B. Mistral Medium 3.5 drew attention because it tries to bundle reasoning, coding, and agent work into a model people can still imagine self-hosting.

#mistral #llm #open-weights