LLM

LLM X/Twitter Mar 5, 2026 1 min read

OpenAI Starts GPT-5.4 Rollout Across ChatGPT, API, and Codex

OpenAI announced that GPT-5.4 Thinking and GPT-5.4 Pro are rolling out in ChatGPT, while GPT-5.4 is already available in the API and Codex. The launch positions GPT-5.4 as a unified frontier model for reasoning, coding, and agentic workflows.

#openai #gpt-5-4 #chatgpt

LLM Hacker News Mar 5, 2026 1 min read

Google Workspace CLI Brings Unified Workspace Automation for Humans and AI Agents

A high-ranking Hacker News post highlighted Google Workspace CLI, an open-source tool that unifies Workspace APIs behind one command surface with structured JSON output, dynamic discovery-based commands, and agent-oriented workflows.

#google-workspace #cli #developer-tools

LLM Mar 5, 2026 1 min read

Anthropic Details AI-Resistant Technical Evaluations for Engineering Hiring

In a January 21, 2026 engineering post, Anthropic explained how it repeatedly redesigned a take-home performance test as Claude models improved. The company describes how Opus 4 and Opus 4.5 changed the evaluation baseline and forced process-level updates.

#anthropic #claude #evaluation

LLM X/Twitter Mar 5, 2026 1 min read

Anthropic Says Opus 3 Will Publish on Substack for at Least 3 Months

Anthropic posted that, following its retirement interviews, Opus 3 will continue sharing its reflections on a Substack blog for at least the next three months. The update points to an ongoing public publishing format rather than a one-off model announcement.

#anthropic #claude #opus

LLM X/Twitter Mar 5, 2026 1 min read

Google AI Developers Announces Gemini 3.1 Flash-Lite Preview

Google AI Developers announced that Gemini 3.1 Flash-Lite is rolling out in preview via the Gemini API and Google AI Studio. The post positions it as the fastest and most cost-efficient model in the Gemini 3 line, now adding dynamic thinking for task-adaptive reasoning.

#gemini #google #api

LLM Hacker News Mar 5, 2026 2 min read

Qwen 3.5 Momentum Meets Team Upheaval at Alibaba

A high-ranking Hacker News thread highlighted a two-sided Qwen story: rapid model quality gains and potential organizational instability. As Qwen 3.5 expands across model sizes, reported leadership departures raise questions about roadmap continuity in the open-weight LLM ecosystem.

#qwen #open-weights #llm

LLM Reddit Mar 5, 2026 2 min read

LocalLLaMA spotlights Microsoft’s Phi-4-Reasoning-Vision-15B release

A high-engagement LocalLLaMA post on March 4, 2026 discussed Microsoft’s open-weight Phi-4-Reasoning-Vision-15B and focused on practical deployment tradeoffs for local multimodal inference.

#phi-4 #multimodal #vision-language

LLM Hacker News Mar 5, 2026 2 min read

NanoGPT Slowrun community debate highlights data-efficient LLM training

A March 4, 2026 Hacker News thread drew attention to Q Labs’ Slowrun benchmark, which fixes the training data at 100M FineWeb tokens and optimizes for data efficiency under large compute budgets.

#nanogpt #data-efficiency #llm-training

LLM X/Twitter Mar 4, 2026 1 min read

NVIDIA and SGLang Claim Major DeepSeek R1 Inference Speedups

NVIDIA AI Developer says a collaboration with SGLang achieved up to 25x faster DeepSeek R1 inference on GB300 NVL72 versus H200, along with an 8x gain on GB200 NVL72 within months. The post attributes the gains to NVFP4 precision, disaggregation, and communication-compute overlap.

#nvidia #sglang #inference

LLM X/Twitter Mar 4, 2026 1 min read

OpenAI Developers Announces Codex App for Windows

OpenAI Developers posted that the Codex app is now available on Windows with a native agent sandbox and PowerShell-oriented developer environment support. The update extends Codex usage beyond previous desktop workflows and signals deeper Windows integration for agentic coding tasks.

#openai #codex #windows

LLM Reddit Mar 4, 2026 1 min read

r/LocalLLaMA benchmark compares Qwen3.5-27B Q4 quants using KLD and size tradeoffs

A high-scoring LocalLLaMA post benchmarked Qwen3.5-27B Q4 GGUF variants against BF16, separating “closest-to-baseline” choices from “best efficiency” picks for constrained VRAM setups.

#qwen #quantization #gguf

LLM Hacker News Mar 4, 2026 1 min read

Unsloth publishes a practical Qwen3.5 fine-tuning guide with concrete VRAM targets

A high-signal Hacker News thread surfaced Unsloth’s Qwen3.5 guide, which maps model sizes to bf16 LoRA VRAM budgets and clarifies MoE, vision, and export paths for production workflows.

#qwen #fine-tuning #unsloth