Taalas has released an ASIC that physically etches the weights of Llama 3.1 8B into silicon, achieving 17,000 tokens per second. The company claims the design is 10x faster, 10x cheaper, and 10x more power-efficient than GPU-based inference systems.
#llm
ByteDance released Doubao 2.0 ahead of Lunar New Year, claiming parity with GPT-5.2 and Gemini 3 Pro on the strength of a 98.3 on AIME 2025, a 3020 Codeforces rating, and pricing roughly 10x cheaper than Western rivals.
AI researcher Andrej Karpathy argues that LLMs fundamentally change the constraints of software, in particular because they excel at translating code between languages and frameworks. He predicts that large fractions of all software ever written will be rewritten many times over as AI reshapes the programming landscape.
Startup Taalas is taking a radical approach to AI inference: etching an LLM's weights and architecture directly into a silicon chip. Their Llama 3.1 8B demo reaches a reported 17,000 tokens per second, but the approach is a bet that model architectures won't change quickly.
Google DeepMind announced Gemini 3.1 Pro, claiming major gains in model intelligence for tackling tougher problems. It is rolling out to Google AI Pro and Ultra subscribers in the Gemini app and NotebookLM, with an API preview in Google AI Studio.
OpenAI published five model-generated submissions to the First Proof math challenge. None were accepted as valid solutions, but the release gives researchers direct evidence of where frontier reasoning systems succeed and fail.
A high-engagement Hacker News thread spotlights Taalas’ claim that model-specific silicon can cut inference latency and cost, including a hard-wired Llama 3.1 8B deployment reportedly reaching 17K tokens/sec per user.
In a February 4, 2026 post, Anthropic said Claude conversations will remain ad-free and not include unsolicited product placements. The company argues that conversational AI requires clearer trust incentives than ad-supported feed or search models.
A top Hacker News discussion tracked Google’s Gemini 3.1 Pro rollout. Google positions it as a stronger reasoning baseline, highlighting a 77.1% ARC-AGI-2 score and broad preview availability across developer, enterprise, and consumer channels.
A high-signal Hacker News post highlighted StepFun's Step 3.5 Flash launch, describing a 196B-parameter MoE foundation model with about 11B active parameters, 256K context, and vendor-reported coding/agent benchmarks.
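The active-parameter economics behind a sparse MoE model like Step 3.5 Flash can be sketched with a toy top-k router (illustrative only; this is not StepFun's actual gating design, and the sizes here are made up for the demo):

```python
import numpy as np

# Toy top-k mixture-of-experts routing: per token, a learned gate selects
# k of E experts, so only a fraction of the expert parameters runs per
# forward pass. Hypothetical sizes chosen for illustration.
rng = np.random.default_rng(0)

E, k, d = 8, 2, 16                  # experts, experts active per token, hidden size
tokens = rng.normal(size=(4, d))    # a batch of 4 token embeddings
gate_w = rng.normal(size=(d, E))    # gating projection

logits = tokens @ gate_w                      # (4, E) router scores
topk = np.argsort(logits, axis=-1)[:, -k:]    # indices of the k best experts per token

# Softmax over only the selected experts' logits to get mixing weights
sel = np.take_along_axis(logits, topk, axis=-1)
weights = np.exp(sel - sel.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

print(f"{k}/{E} experts active per token -> {k / E:.0%} of expert params used")
```

At Step 3.5 Flash's reported scale, 11B active of 196B total works out to roughly 5.6% of parameters participating per token, which is where the cost and latency savings of sparse MoE come from.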
Anthropic introduced Claude Sonnet 4.6 with a 1M token context window (beta), stronger coding/computer-use performance, and unchanged API pricing at $3/$15 per million tokens.
Anthropic announced Claude Sonnet 4.6 on February 17, 2026, positioning it as a full upgrade across coding, computer use, and long-context reasoning. The model becomes default for Free/Pro users and keeps Sonnet 4.5 API pricing at $3/$15 per million tokens.
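The $3 input / $15 output per-million-token rates quoted above make per-workload cost easy to estimate; a minimal sketch (the rates come from the announcement, while the example token counts are hypothetical):

```python
# Back-of-envelope API cost at the reported Sonnet 4.6 rates:
# $3 per million input tokens, $15 per million output tokens.
INPUT_PER_MTOK = 3.00
OUTPUT_PER_MTOK = 15.00

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one workload at the per-million-token rates above."""
    return (input_tokens / 1e6) * INPUT_PER_MTOK + (output_tokens / 1e6) * OUTPUT_PER_MTOK

# Hypothetical workload: 2M input tokens plus 400K generated tokens.
print(f"${cost_usd(2_000_000, 400_000):.2f}")  # → $12.00
```

Output tokens dominate at 5x the input rate, so long generations (agentic loops, verbose reasoning traces) drive cost far more than long prompts do.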