An r/LocalLLaMA post on Qwen3.5 gained 123 upvotes and pointed directly to public weights and model documentation. The linked card confirms key specs, including 397B total parameters, 17B activated parameters, and a native context length of 262,144 tokens.
#llm
A high-engagement r/LocalLLaMA thread tracked the MiniMax-M2.5 release on Hugging Face. The model card emphasizes agentic coding/search benchmarks, runtime speedups, and aggressive cost positioning.
A Show HN post introduces Off Grid, an open-source Android/iOS app that runs chat, image generation, vision, and speech transcription entirely on-device without cloud data transfer.
A widely discussed Hacker News post compares Anthropic and OpenAI fast modes and argues that LLM speed gains are increasingly driven by serving architecture, not just model quality.
A high-signal Hacker News discussion on GPT-5.3-Codex-Spark points to a shift toward low-latency coding loops: claims of 1,000+ tokens/s, transport and kernel optimizations, and patch-first interaction design.
Anthropic released Claude Opus 4.6, achieving industry-leading performance in coding, long-context retrieval, and knowledge work.
A researcher dramatically improved 15 LLMs' coding performance with a single change. Redesigning the edit tool rather than the model raised Grok Code Fast's success rate roughly tenfold, from 6.7% to 68.3%.
China's GLM-5 model achieves a score of 50 on the Intelligence Index, claiming top performance among open-source large language models.