#mixture-of-experts

LLM Hacker News May 2, 2026 1 min read

DeepSeek V4: frontier-class performance at one-tenth the price, released as 1.6-trillion-parameter open weights

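DeepSeek has released two models, DeepSeek-V4-Pro and V4-Flash. Pro is a Mixture-of-Experts model with 1.6 trillion parameters (49B active), the largest open-weight model released to date. It is priced at less than half of GPT-5.4 and Gemini 3.1 Pro, which makes cost efficiency its key differentiator.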

#deepseek #llm #open-weights

LLM Reddit Mar 25, 2026 1 min read

GigaChat 3.1 open weights draw r/LocalLLaMA's attention, from 10B to 702B

r/LocalLLaMA reacted strongly to GigaChat 3.1. The release spans both a local-friendly 10B A1.8B MoE and a 702B frontier-scale MoE. Both were released under MIT terms, and both are presented as trained from scratch.

#open-weights #gigachat #mixture-of-experts

LLM Hacker News Mar 23, 2026 2 min read

Flash-MoE publishes an experiment running a 397B Qwen model on a 48GB MacBook Pro

Flash-MoE, which drew attention on Hacker News, published a method for running Qwen3.5-397B-A17B at conversational speed on a 48GB M3 Max laptop using SSD streaming and Metal kernels.

#llm #mixture-of-experts #metal

LLM Reddit Mar 12, 2026 1 min read

How r/LocalLLaMA saw the NVIDIA Nemotron 3 Super release

NVIDIA's Nemotron 3 Super is pitched as a 120B total / 12B active hybrid Mamba-Transformer MoE with a native 1M-token context, shipped alongside open weights, datasets, and recipes. The LocalLLaMA discussion focused on whether these openness and efficiency claims can translate into real home-lab deployments.

#nvidia #open-weights #mixture-of-experts