GitHub announced that Anthropic's Claude Sonnet 4.6 is now generally available in GitHub Copilot. Early testing shows excellent performance for agentic coding and search operations in VS Code and Copilot CLI.
Google DeepMind announced Gemini 3.1 Pro, featuring major improvements to overall model intelligence for tackling tougher problems. Rolling out to Google AI Pro and Ultra subscribers in the Gemini app and NotebookLM, with API preview in Google AI Studio.
Alibaba launched Qwen 3.5 on February 16 under Apache 2.0, featuring 397B parameters with a sparse MoE architecture (17B active), 256K context, and native multimodal capabilities matching leading US proprietary models on key benchmarks.
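The headline MoE numbers mean only about 4% of Qwen 3.5's weights are touched per token: a router picks the top-k experts for each token, so compute scales with the 17B active parameters rather than the 397B total. A minimal sketch of that idea (the 6-expert layout and router scores below are illustrative, not Qwen 3.5's actual configuration):

```python
# Toy sketch of sparse MoE routing: each token activates only its top-k
# experts, so per-token compute tracks active params, not total params.

def top_k_experts(scores, k):
    """Return indices of the k highest-scoring experts for one token."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Fraction of parameters active per token, from the announced figures:
active_fraction = 17 / 397
print(f"active fraction: {active_fraction:.1%}")  # ~4.3%

scores = [0.1, 0.7, 0.05, 0.9, 0.3, 0.2]  # router logits for 6 toy experts
print(top_k_experts(scores, k=2))  # experts 3 and 1 score highest: [3, 1]
```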
Anthropic launched Claude Sonnet 4.6 on February 17, offering major upgrades in coding, computer use, and agent planning. It is now the default model for Free and Pro users, at the same $3/$15 per million tokens pricing.
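At the announced $3 per million input tokens and $15 per million output tokens, per-request cost is easy to estimate (the example token counts are illustrative):

```python
def request_cost_usd(input_tokens, output_tokens,
                     input_per_mtok=3.0, output_per_mtok=15.0):
    """Estimate API cost at the announced $3/$15 per million tokens."""
    return (input_tokens * input_per_mtok
            + output_tokens * output_per_mtok) / 1_000_000

# e.g. a 20K-token prompt with a 2K-token reply:
print(f"${request_cost_usd(20_000, 2_000):.3f}")  # $0.090
```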
A high-signal LocalLLaMA thread points to llama.cpp Discussion #19759, where maintainers say the ggml team is joining Hugging Face while continuing full-time support for ggml and llama.cpp.
Anthropic has released Claude Code Security in limited research preview, targeting vulnerability discovery and patch suggestion workflows while keeping human approval at the center.
On 2026-02-19, Google announced Gemini 3.1 Pro and began rolling it out across developer, enterprise, and consumer surfaces. The post reports a verified ARC-AGI-2 score of 77.1% and lists immediate access via Gemini API, Gemini CLI, Vertex AI, Gemini app, and NotebookLM.
A technical r/LocalLLaMA thread pointed to llama.cpp PR #19765, merged on February 20, 2026. The patch unifies parser paths as a stop-gap for Qwen3-Coder-Next issues and adds parallel tool-calling plus JSON schema fixes.
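With parallel tool calling, a single assistant turn can carry several tool calls at once, each with JSON-schema-constrained arguments. A sketch of what that looks like in the OpenAI-compatible chat-completions shape that llama.cpp's server speaks (the model name and tool names here are hypothetical placeholders, not from the PR):

```python
import json

# Request body with two tools; names and model are illustrative only.
request = {
    "model": "qwen3-coder-next",  # placeholder model name
    "messages": [{"role": "user", "content": "Weather and time in Tokyo?"}],
    "tools": [
        {"type": "function", "function": {"name": "get_weather",
         "parameters": {"type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"]}}},
        {"type": "function", "function": {"name": "get_time",
         "parameters": {"type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"]}}},
    ],
}

# With parallel tool calling, one assistant message may hold several calls:
sample_response_message = {
    "role": "assistant",
    "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Tokyo"}'}},
        {"id": "call_2", "type": "function",
         "function": {"name": "get_time", "arguments": '{"city": "Tokyo"}'}},
    ],
}

for call in sample_response_message["tool_calls"]:
    args = json.loads(call["function"]["arguments"])  # schema-constrained JSON
    print(call["function"]["name"], args["city"])
```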
On February 20, 2026, Anthropic introduced Claude Code Security in limited research preview. The feature scans codebases for vulnerabilities and proposes patches, while keeping final remediation decisions under human review and approval.
A high-engagement r/singularity post pointed to arXiv 2602.15322, which reports that masked adaptive updates and the proposed Magma optimizer can improve 1B-model perplexity versus Adam and Muon with minimal overhead.
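The generic idea behind masked adaptive updates is that only a masked subset of coordinates receives an Adam-style second-moment-normalized step, while the rest take a plain step. The sketch below illustrates that concept only; it is NOT the paper's Magma algorithm, whose details are not reproduced here:

```python
import math

def masked_adaptive_step(params, grads, v, mask, lr=1e-3, beta2=0.999, eps=1e-8):
    """Illustrative masked adaptive update (not the Magma optimizer itself):
    masked coordinates get an Adam-style second-moment-scaled step,
    unmasked coordinates fall back to a plain SGD step."""
    for i, g in enumerate(grads):
        v[i] = beta2 * v[i] + (1 - beta2) * g * g  # second-moment estimate
        if mask[i]:
            params[i] -= lr * g / (math.sqrt(v[i]) + eps)  # adaptive step
        else:
            params[i] -= lr * g                             # plain step
    return params

p = masked_adaptive_step([1.0, 1.0], [0.5, 0.5],
                         v=[0.0, 0.0], mask=[True, False])
print(p)  # adaptive coordinate moves much further than the plain one
```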
A high-score Hacker News discussion surfaced Together AI's CDLM post, which claims up to 14.5x latency improvements for diffusion language models by combining trajectory-consistent step reduction with exact block-wise KV caching.
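The intuition behind exact block-wise KV caching is that once a block of tokens is finalized during diffusion decoding, its key/value projections stop changing across denoising steps, so they can be computed once and reused exactly. A toy sketch of that bookkeeping (not Together AI's CDLM implementation; the "projection" here is a stand-in):

```python
# Illustrative block-wise KV cache: frozen blocks are projected once,
# only the active block is recomputed at each denoising step.

def fake_kv(tokens):
    """Stand-in for the attention K/V projection of a token block."""
    return [(t * 2, t * 3) for t in tokens]

kv_cache = {}
compute_calls = 0

def kv_for_block(block_id, tokens):
    global compute_calls
    if block_id not in kv_cache:        # frozen blocks hit the cache
        compute_calls += 1
        kv_cache[block_id] = fake_kv(tokens)
    return kv_cache[block_id]

# Simulate 4 denoising steps over 3 blocks; blocks 0 and 1 are finalized:
for step in range(4):
    for block_id, tokens in enumerate([[1, 2], [3, 4], [5, 6]]):
        if block_id < 2:
            kv_for_block(block_id, tokens)  # computed once, then cached
        else:
            compute_calls += 1              # active block: recompute
            fake_kv(tokens)

print(compute_calls)  # 2 frozen blocks computed once + 4 recomputes = 6
```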
A high-scoring Hacker News thread highlighted discussion #19759 in ggml-org/llama.cpp: the ggml.ai founding team is joining Hugging Face, while maintainers state ggml/llama.cpp will remain open-source and community-driven.