A Show HN post introduces Off Grid, an open-source Android/iOS app that runs chat, image generation, vision, and speech transcription entirely on-device without cloud data transfer.
OpenAI reports that, across more than one million ChatGPT conversations, the share of difficult interactions exceeding a human baseline increased roughly fourfold from September 2024 to January 2026. The company also shows large gains in case-interview and puzzle-style open tasks.
A Reddit thread amplified an Ars Technica report that Google detected a 100,000+ prompt extraction campaign against Gemini, reopening questions about distillation, defense, and IP boundaries.
A widely discussed Hacker News post compares Anthropic and OpenAI fast modes and argues that LLM speed gains are increasingly driven by serving architecture, not just model quality.
NIST’s CAISI released draft guidance NIST AI 800-2 for automated language-model benchmark evaluations and opened comments through March 31, 2026. The draft focuses on objective setting, execution methodology, and analysis/reporting quality.
OpenAI said on January 29, 2026 that ChatGPT would stop offering GPT-4o and older model options from February 13, 2026. GPT-4o, GPT-4.5, and o4-mini are being replaced by GPT-5, GPT-5 thinking, and o5-mini respectively.
A high-signal r/LocalLLaMA post (score 456, 84 comments) introduces KaniTTS2, an open-source 400M-parameter TTS model with voice cloning that runs in 3GB of VRAM; pretraining code is included.
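As a quick sanity check on the 3GB VRAM figure, the weights alone are far smaller than the total budget. Only the 400M parameter count comes from the post; the dtype sizes below are standard, and the framing of "headroom" is an illustrative assumption:

```python
# Rough VRAM sanity check for a 400M-parameter model.
# Only the parameter count comes from the post; dtype byte sizes
# are standard, and the interpretation is illustrative.

def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Memory occupied by the weights alone, in GiB."""
    return n_params * bytes_per_param / (1024 ** 3)

params = 400e6
for dtype, nbytes in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{dtype}: {weight_memory_gib(params, nbytes):.2f} GiB for weights")

# fp16 weights come to roughly 0.75 GiB, so a 3 GB budget plausibly
# leaves room for activations, a vocoder, and runtime overhead.
```

The gap between ~0.75 GiB of fp16 weights and the stated 3 GB is the kind of practical check worth running before assuming a model fits on a given GPU.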
A high-signal r/singularity post (score 536, 100 comments) discusses OpenAI's claim that an internal model may have solved six frontier research problems, with commenters pressing for independent verification.
Anthropic announced on January 28, 2026 that ServiceNow selected Claude as its default model for AI agent development. ServiceNow cited up to 95% productivity gains in some workflows and reported large-scale AI request volumes.
A popular r/LocalLLaMA post details Heretic 1.2 with PEFT/LoRA updates, optional 4-bit processing, MPOA support, VL coverage, and automatic resume features for long local optimization runs.
A high-signal Hacker News discussion on GPT-5.3-Codex-Spark points to a shift toward low-latency coding loops: 1000+ tokens/s claims, transport and kernel optimizations, and patch-first interaction design.
A high-signal r/LocalLLaMA thread tracked the merge of llama.cpp PR #19375 and highlighted practical throughput gains for Qwen3Next models. Both PR benchmarks and community tests suggest meaningful t/s improvements from graph-level copy reduction.
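For readers who want to reproduce the community throughput numbers themselves, llama.cpp ships a llama-bench tool; a minimal before/after comparison is a sketch like the following (the model path is a placeholder, and exact flags may vary by build):

```shell
# Build llama.cpp at a commit before the PR and at one after it,
# then run the identical benchmark on each build.
# The model path below is a placeholder, not from the post.
./llama-bench -m ./models/qwen3next.gguf -p 512 -n 128

# Compare the reported t/s figures (prompt processing and token
# generation) between the two builds to measure the effect of the
# graph-level copy reduction.
```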