LLM

LLM Hacker News Mar 13, 2026 2 min read

Hacker News spots CanIRun.ai, a browser-side local AI compatibility checker

CanIRun.ai runs entirely in the browser, detects GPU, CPU, and RAM through WebGL, WebGPU, and navigator APIs, and estimates which quantized models fit your machine. HN readers liked the idea but immediately pushed on missing hardware entries, calibration, and reverse-lookup features.

#local-ai #llm-inference #hardware

LLM Mar 13, 2026 2 min read

NVIDIA releases open Nemotron 3 Super with 1M context and up to 5x higher throughput for agentic AI

NVIDIA introduced Nemotron 3 Super on March 11, 2026 as an open 120B-parameter model built for agentic AI systems. The company says the model tackles long-context cost and reasoning overhead with a 1M-token window, hybrid MoE design and up to 5x higher throughput.

#nvidia #nemotron #agentic-ai

LLM Mar 13, 2026 2 min read

Google opens Gemini Embedding 2 preview as its first natively multimodal embedding model

Google has put Gemini Embedding 2 into public preview through the Gemini API and Vertex AI. The model is Google’s first natively multimodal embedding system, combining text, image, video, audio, and document inputs in one embedding space.

#google #gemini #embeddings

LLM Mar 13, 2026 2 min read

OpenAI brings Codex Automations to general availability with model, branch, and template controls

OpenAI says Codex Automations are generally available and now expose controls for model choice, reasoning level, branch strategy, and reusable templates. The update pushes Codex from ad hoc sessions toward repeatable background workflows for software teams.

#openai #codex #automation

LLM Reddit Mar 13, 2026 2 min read

r/MachineLearning pushes back on an ICML submission that appears fully AI-written

A reviewer in r/MachineLearning says an ICML paper in a no-LLM track reads as if it was fully generated by AI, opening a blunt discussion about enforcement, review burden, and whether writing quality itself has become a policy signal.

#research #peer-review #llm-writing

LLM Hacker News Mar 13, 2026 2 min read

Document poisoning in RAG systems shows why ingestion controls matter more than output filters

A Hacker News discussion around Amine Raji's local ChromaDB lab highlights a practical risk in RAG systems: attackers can win by contaminating the source corpus, and the strongest defense may sit at ingestion rather than in the prompt.

#rag #security #retrieval

LLM Mar 13, 2026 2 min read

Microsoft unveils Frontier Suite and sets Agent 365 GA for May 1 as Copilot adds Claude

Microsoft on March 9, 2026 announced the Frontier Suite, expanded Copilot model diversity with Claude and next-generation OpenAI models, and scheduled Agent 365 general availability for May 1 at $15 per user. Microsoft 365 E7, the Frontier Suite bundle, is also set for May 1 at $99 per user.

#microsoft #copilot #agent-365

LLM Mar 13, 2026 2 min read

Google previews Gemini 3.1 Flash-Lite as its fastest and most cost-efficient Gemini 3 model

Google on March 3, 2026 introduced Gemini 3.1 Flash-Lite as the fastest and most cost-efficient model in the Gemini 3 family. The preview is rolling out through Google AI Studio and Vertex AI at $0.25/1M input tokens and $1.50/1M output tokens.

#google #gemini #flash-lite

LLM Reddit Mar 13, 2026 1 min read

OmniCoder-9B Brings Frontier Agent Traces to a 9B Open Coding Model

OmniCoder-9B packages agent-style coding behavior into a smaller open model by training on more than 425,000 curated trajectories from real tool-using workflows.

#coding-agents #open-models #qwen

LLM Reddit Mar 13, 2026 2 min read

Reddit Research Notes: A 7-Layer Duplication Trick Climbs the Open LLM Leaderboard

A post in r/MachineLearning argues that duplicating a specific seven-layer block inside Qwen2-72B improved benchmark performance without changing any weights.

#transformers #benchmarks #open-models

LLM Hacker News Mar 13, 2026 1 min read

Claude Adds Interactive Charts and Diagrams Inside Chat

Anthropic has added inline interactive visuals to Claude, and Hacker News users are treating it as a real workflow upgrade for analysis and explanation rather than a cosmetic demo.

#claude #anthropic #visualization

LLM Mar 12, 2026 2 min read

NIST AI 800-3 formalizes benchmark and generalized accuracy for AI evaluations

NIST says AI 800-3 gives evaluators a clearer statistical framework by separating benchmark accuracy from generalized accuracy and by introducing generalized linear mixed models for uncertainty estimation. The February 19, 2026 report argues that many current benchmark comparisons hide assumptions that can distort procurement, development, and policy decisions.

#nist #llm-evals #benchmarks