CanIRun.ai runs entirely in the browser, detects GPU, CPU, and RAM through WebGL, WebGPU, and navigator APIs, and estimates which quantized models fit your machine. HN readers liked the idea but immediately pushed on missing hardware entries, calibration, and reverse-lookup features.
LLM
RSS FeedNVIDIA introduced Nemotron 3 Super on March 11, 2026 as an open 120B-parameter model built for agentic AI systems. The company says the model tackles long-context cost and reasoning overhead with a 1M-token window, hybrid MoE design and up to 5x higher throughput.
Google has put Gemini Embedding 2 into public preview through the Gemini API and Vertex AI. The model is Google’s first natively multimodal embedding system, combining text, image, video, audio, and document inputs in one embedding space.
OpenAI says Codex Automations are generally available and now expose controls for model choice, reasoning level, branch strategy, and reusable templates. The update pushes Codex from ad hoc sessions toward repeatable background workflows for software teams.
A reviewer in r/MachineLearning says an ICML paper in a no-LLM track reads as if it was fully generated by AI, opening a blunt discussion about enforcement, review burden, and whether writing quality itself has become a policy signal.
A Hacker News discussion around Amine Raji's local ChromaDB lab highlights a practical risk in RAG systems: attackers can win by contaminating the source corpus, and the strongest defense may sit at ingestion rather than in the prompt.
Microsoft on March 9, 2026 announced the Frontier Suite, expanded Copilot model diversity with Claude and next-generation OpenAI models, and scheduled Agent 365 general availability for May 1 at $15 per user. Microsoft 365 E7, the Frontier Suite bundle, is also set for May 1 at $99 per user.
Google on March 3, 2026 introduced Gemini 3.1 Flash-Lite as the fastest and most cost-efficient model in the Gemini 3 family. The preview is rolling out through Google AI Studio and Vertex AI at $0.25/1M input tokens and $1.50/1M output tokens.
OmniCoder-9B packages agent-style coding behavior into a smaller open model by training on more than 425,000 curated trajectories from real tool-using workflows.
A post in r/MachineLearning argues that duplicating a specific seven-layer block inside Qwen2-72B improved benchmark performance without changing any weights.
Anthropic has added inline interactive visuals to Claude, and Hacker News users are treating it as a real workflow upgrade for analysis and explanation rather than a cosmetic demo.
NIST says AI 800-3 gives evaluators a clearer statistical framework by separating benchmark accuracy from generalized accuracy and by introducing generalized linear mixed models for uncertainty estimation. The February 19, 2026 report argues that many current benchmark comparisons hide assumptions that can distort procurement, development, and policy decisions.