OpenAI announced GPT-5.3 Codex Spark on February 12, 2026, positioning it as a coding-focused model optimized for practical throughput and cost efficiency. The company reports lower latency and token cost versus GPT-5.2 while maintaining strong benchmark results.
#llm
An r/LocalLLaMA post on Qwen3.5 gained 123 upvotes and pointed directly to the public weights and model documentation. The linked model card confirms key specs: 397B total parameters, 17B activated, and a 262,144-token native context length.
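For anyone who wants to try the public weights, a minimal loading sketch with Hugging Face transformers follows; the repository id is a guess inferred from the reported specs, not taken from the post, so check the actual model card.

```python
# Minimal sketch: loading the released weights with transformers.
# The repo id is hypothetical, inferred from the reported specs
# (397B total / 17B activated); verify it against the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Qwen/Qwen3.5-397B-A17B"  # hypothetical identifier

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",   # keep the checkpoint's native dtype
    device_map="auto",    # shard the weights across available GPUs
)

inputs = tokenizer("Explain mixture-of-experts in one sentence.",
                   return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```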
A high-engagement r/LocalLLaMA thread tracked the MiniMax-M2.5 release on Hugging Face. The model card emphasizes agentic coding/search benchmarks, runtime speedups, and aggressive cost positioning.
A Show HN post introduces Off Grid, an open-source Android/iOS app that runs chat, image generation, vision, and speech transcription entirely on-device without cloud data transfer.
A widely discussed Hacker News post compares Anthropic and OpenAI fast modes and argues that LLM speed gains are increasingly driven by serving architecture, not just model quality.
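The thread doesn't settle on a single mechanism, but speculative decoding is one representative serving-side technique: a cheap draft model proposes tokens and the large model only verifies them. The sketch below is a toy with stub models (no real inference); the vocabulary, acceptance rate, and function names are all illustrative assumptions.

```python
import random

random.seed(0)
VOCAB = list("abcdefgh")

def draft_next(ctx):
    # Stand-in for a small, fast draft model: deterministic given context.
    return VOCAB[hash("".join(ctx)) % len(VOCAB)]

def target_next(ctx):
    # Stand-in for the large target model; it usually agrees with the
    # draft, so several proposals per step survive verification.
    if random.random() < 0.7:
        return draft_next(ctx)
    return random.choice(VOCAB)

def speculative_step(ctx, k=4):
    # 1) Draft k tokens cheaply.
    proposals, c = [], list(ctx)
    for _ in range(k):
        t = draft_next(c)
        proposals.append(t)
        c.append(t)
    # 2) Verify with the target. In a real server all k positions are
    #    scored in ONE batched target forward pass -- that is the speedup.
    accepted, c = [], list(ctx)
    for t in proposals:
        t_target = target_next(c)
        if t_target == t:
            accepted.append(t)
            c.append(t)
        else:
            accepted.append(t_target)  # keep the target's token, stop early
            break
    return accepted

ctx = list("ab")
for _ in range(5):
    ctx.extend(speculative_step(ctx))
print("".join(ctx))
```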
A high-signal Hacker News discussion on GPT-5.3-Codex-Spark points to a shift toward low-latency coding loops: 1000+ tokens/s claims, transport and kernel optimizations, and patch-first interaction design.
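A rough, illustrative calculation shows why patch-first output and high decode speed compound; the token counts below are assumptions for illustration, not figures from the thread.

```python
# Back-of-the-envelope: why patch-first output matters at 1000+ tok/s.
# Both token counts below are illustrative assumptions.
TOKENS_PER_SEC = 1000       # claimed decode speed
FULL_REWRITE_TOKENS = 4000  # re-emitting a ~300-line file
PATCH_TOKENS = 120          # emitting only a small hunk

print(f"full rewrite: {FULL_REWRITE_TOKENS / TOKENS_PER_SEC:.2f}s")  # 4.00s
print(f"patch-first:  {PATCH_TOKENS / TOKENS_PER_SEC:.2f}s")         # 0.12s
```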
Anthropic released Claude Opus 4.6, which the company says achieves industry-leading performance in coding, long-context retrieval, and knowledge work.
DeepSeek is reportedly set to launch its next-generation coding-focused model, V4, in mid-February, featuring a context window of over 1M tokens and support for consumer GPUs, which would make it unusually accessible to developers.
A researcher dramatically improved the coding performance of 15 LLMs with a single change: redesigning the edit tool rather than the model. Grok Code Fast's success rate jumped roughly 10x, from 6.7% to 68.3%.
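The write-up's exact tool design isn't reproduced here; a common shape for such a string-replacement edit tool, with the uniqueness check this kind of result usually hinges on, might look like this:

```python
# Sketch of a string-replacement edit tool; the researcher's actual
# interface is an assumption here, not reproduced from the post.
# Key design choice: the old text must match exactly once, so the
# model gets a precise error message instead of a corrupted file.
from pathlib import Path

def edit_file(path: str, old: str, new: str) -> str:
    text = Path(path).read_text()
    n = text.count(old)
    if n == 0:
        return "error: old text not found; re-read the file and retry"
    if n > 1:
        return f"error: old text matches {n} locations; add more context"
    Path(path).write_text(text.replace(old, new, 1))
    return "ok"
```

Returning errors as strings rather than raising keeps failures inside the tool-call loop, where the model can read them and retry.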
China's GLM-5 model scored 50 on the Intelligence Index, reportedly the top result among open-source large language models.