OpenAI announced GPT-5.3 Codex Spark on February 12, 2026, positioning it as a coding-focused model optimized for practical throughput and cost efficiency. The company reports lower latency and token cost versus GPT-5.2 while maintaining strong benchmark results.
#llm
An r/LocalLLaMA post on Qwen3.5 gained 123 upvotes and pointed directly to the public weights and model documentation. The linked model card confirms key specs: 397B total parameters, 17B activated, and a 262,144-token native context length.
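For anyone who wants to try the public weights, a minimal loading sketch with Hugging Face transformers follows; the repository id is a guess inferred from the reported specs, not taken from the post, so check the actual model card.

```python
# Minimal sketch: loading the released weights with transformers.
# The repo id is hypothetical, inferred from the reported specs
# (397B total / 17B activated); verify it against the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Qwen/Qwen3.5-397B-A17B"  # hypothetical identifier

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",   # keep the checkpoint's native dtype
    device_map="auto",    # shard the weights across available GPUs
)

inputs = tokenizer("Explain mixture-of-experts in one sentence.",
                   return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```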
A high-engagement r/LocalLLaMA thread tracked the MiniMax-M2.5 release on Hugging Face. The model card emphasizes agentic coding/search benchmarks, runtime speedups, and aggressive cost positioning.
A Show HN post introduces Off Grid, an open-source Android/iOS app that runs chat, image generation, vision, and speech transcription entirely on-device without cloud data transfer.
A widely discussed Hacker News post compares Anthropic and OpenAI fast modes and argues that LLM speed gains are increasingly driven by serving architecture, not just model quality.
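The thread doesn't settle on a single mechanism, but speculative decoding is one representative serving-side technique: a cheap draft model proposes tokens and the large model only verifies them. The sketch below is a toy with stub models (no real inference); the vocabulary, acceptance rate, and function names are all illustrative assumptions.

```python
import random

random.seed(0)
VOCAB = list("abcdefgh")

def draft_next(ctx):
    # Stand-in for a small, fast draft model: deterministic given context.
    return VOCAB[hash("".join(ctx)) % len(VOCAB)]

def target_next(ctx):
    # Stand-in for the large target model; it usually agrees with the
    # draft, so several proposals per step survive verification.
    if random.random() < 0.7:
        return draft_next(ctx)
    return random.choice(VOCAB)

def speculative_step(ctx, k=4):
    # 1) Draft k tokens cheaply.
    proposals, c = [], list(ctx)
    for _ in range(k):
        t = draft_next(c)
        proposals.append(t)
        c.append(t)
    # 2) Verify with the target. In a real server all k positions are
    #    scored in ONE batched target forward pass -- that is the speedup.
    accepted, c = [], list(ctx)
    for t in proposals:
        t_target = target_next(c)
        if t_target == t:
            accepted.append(t)
            c.append(t)
        else:
            accepted.append(t_target)  # keep the target's token, stop early
            break
    return accepted

ctx = list("ab")
for _ in range(5):
    ctx.extend(speculative_step(ctx))
print("".join(ctx))
```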
A high-signal Hacker News discussion on GPT-5.3-Codex-Spark points to a shift toward low-latency coding loops: 1000+ tokens/s claims, transport and kernel optimizations, and patch-first interaction design.
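A rough, illustrative calculation shows why patch-first output and high decode speed compound; the token counts below are assumptions for illustration, not figures from the thread.

```python
# Back-of-the-envelope: why patch-first output matters at 1000+ tok/s.
# Both token counts below are illustrative assumptions.
TOKENS_PER_SEC = 1000       # claimed decode speed
FULL_REWRITE_TOKENS = 4000  # re-emitting a ~300-line file
PATCH_TOKENS = 120          # emitting only a small hunk

print(f"full rewrite: {FULL_REWRITE_TOKENS / TOKENS_PER_SEC:.2f}s")  # 4.00s
print(f"patch-first:  {PATCH_TOKENS / TOKENS_PER_SEC:.2f}s")         # 0.12s
```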
Anthropic released Claude Opus 4.6, which the company says achieves industry-leading performance in coding, long-context retrieval, and knowledge work.
DeepSeek is reportedly set to launch its next-generation coding-focused model, V4, in mid-February, featuring a context window of over 1M tokens and support for consumer GPUs, which would make it unusually accessible to developers.
A researcher dramatically improved the coding performance of 15 LLMs with a single change: redesigning the edit tool rather than the model. Grok Code Fast's success rate jumped roughly 10x, from 6.7% to 68.3%.
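The write-up's exact tool design isn't reproduced here; a common shape for such a string-replacement edit tool, with the uniqueness check this kind of result usually hinges on, might look like this:

```python
# Sketch of a string-replacement edit tool; the researcher's actual
# interface is an assumption here, not reproduced from the post.
# Key design choice: the old text must match exactly once, so the
# model gets a precise error message instead of a corrupted file.
from pathlib import Path

def edit_file(path: str, old: str, new: str) -> str:
    text = Path(path).read_text()
    n = text.count(old)
    if n == 0:
        return "error: old text not found; re-read the file and retry"
    if n > 1:
        return f"error: old text matches {n} locations; add more context"
    Path(path).write_text(text.replace(old, new, 1))
    return "ok"
```

Returning errors as strings rather than raising keeps failures inside the tool-call loop, where the model can read them and retry.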
China's GLM-5 model scored 50 on the Intelligence Index, reportedly the top result among open-source large language models.