LLM

LLM X/Twitter Mar 19, 2026 2 min read

OpenAI opens Codex Security research preview for context-aware application security review

OpenAI said on March 6, 2026 that Codex Security is entering research preview for ChatGPT Pro, Enterprise, Business, and Edu users in Codex web. The company says the application-security agent uses project-specific threat models, contextual validation, and patch proposals, and that it scanned more than 1.2 million commits during its beta.

#openai #codex-security #application-security

LLM X/Twitter Mar 19, 2026 2 min read

GitHub upgrades Copilot coding agent with model selection, self-review, security checks, and CLI handoff

GitHub said in a March 17, 2026 X thread that Copilot coding agent now adds model selection, self-review before PRs, built-in code/secret/dependency scanning, custom agents, and cloud-to-CLI handoff. GitHub’s blog frames the upgrade as a smoother delegation workflow for background coding tasks.

#github #copilot #coding-agent

LLM Reddit Mar 19, 2026 2 min read

LocalLLaMA highlights Mamba-3, a state space model built around inference efficiency

A LocalLLaMA thread on March 18, 2026 drew fresh attention to Mamba-3, a new state space model release from researchers at Carnegie Mellon University, Princeton, Cartesia AI, and Together AI. The project shifts its design goal from training speed to inference efficiency and claims lower prefill+decode latency than Mamba-2, Gated DeltaNet, and Llama-3.2-1B at the 1.5B scale.

#mamba-3 #ssm #inference

LLM Mar 19, 2026 2 min read

NVIDIA moves Dynamo 1.0 into production as an inference operating system for AI factories

At GTC on March 16, 2026, NVIDIA announced Dynamo 1.0 as a production-grade open source inference stack for generative and agentic AI. NVIDIA says Dynamo can boost Blackwell inference performance by up to 7x while integrating with major frameworks and cloud providers.

#nvidia #dynamo #inference

LLM Mar 19, 2026 2 min read

Google adds context circulation, tool combos, and Maps grounding to the Gemini API

On March 17, 2026, Google introduced new Gemini API features for agentic workflows, including combined built-in and custom tools, context circulation across tool calls, and Maps grounding for Gemini 3. The update is designed to reduce orchestration work for complex multi-step applications.

#google #gemini #api

LLM Hacker News Mar 19, 2026 2 min read

Hacker News Spots GreenBoost, a Linux stack that stretches GPU VRAM with system RAM and NVMe

A March 15, 2026 Hacker News post about GreenBoost reached 124 points and 25 comments. The open-source Linux project combines a kernel module and CUDA shim to tier model memory across VRAM, DDR4, and NVMe so larger local LLMs can run without changing inference apps.

#nvidia #gpu-memory #local-llm

LLM Mar 18, 2026 2 min read

Google launches Gemini 3.1 Flash-Lite for high-volume AI workloads at lower cost

Google introduced Gemini 3.1 Flash-Lite on March 3, 2026 as its fastest and most cost-efficient Gemini 3 series model. The model is rolling out in preview through the Gemini API in Google AI Studio and Vertex AI, priced at $0.25 per 1M input tokens and $1.50 per 1M output tokens. Google also claims a 2.5x faster Time to First Answer Token and 45% higher output speed than Gemini 2.5 Flash.

#google #gemini #flash-lite
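
To put the Gemini 3.1 Flash-Lite preview prices quoted above in concrete terms, here is a rough cost sketch in Python. Only the two per-token prices come from the summary; the function name, the workload numbers, and the assumption of simple linear billing (no caching, batching, or volume discounts) are illustrative and not taken from Google's documentation.

```python
# Prices quoted in the summary above (USD per 1M tokens).
INPUT_USD_PER_MTOK = 0.25
OUTPUT_USD_PER_MTOK = 1.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Linear cost estimate (assumption: no caching, batching, or tiered discounts)."""
    total_micro_usd = (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    )
    return total_micro_usd / 1_000_000


# Hypothetical workload: 10,000 requests, each with 2,000 input and 500 output tokens.
requests = 10_000
total = estimate_cost_usd(requests * 2_000, requests * 500)
print(f"${total:.2f}")  # prints "$12.50"
```

In this hypothetical workload, output tokens account for $7.50 of the $12.50 total even though they make up only a fifth of the token volume, since the quoted output price is six times the input price.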

LLM X/Twitter Mar 18, 2026 2 min read

OpenAI rolls out GPT-5.4 mini and nano for faster coding, computer use, and subagents

OpenAI said on March 17, 2026 that GPT-5.4 mini is now available in ChatGPT, Codex, and the API, with a follow-up post confirming GPT-5.4 nano in the API. OpenAI's developer docs position mini as its strongest mini model yet for coding, computer use, and subagents, while nano is framed as the cheapest GPT-5.4-class model for high-volume tasks like ranking, extraction, and subagent work.

#openai #gpt-5.4-mini #gpt-5.4-nano

LLM Reddit Mar 18, 2026 2 min read

LocalLLaMA Spots Hugging Face’s hf-agents as a One-Command Path to a Local Coding Agent

A March 17, 2026 r/LocalLLaMA post with 534 points and 69 comments highlighted Hugging Face’s new hf-agents CLI extension. The tool chains llmfit, llama.cpp, and Pi so users can move from hardware detection to a running local coding agent in one command.

#huggingface #llama.cpp #local-llm

LLM Hacker News Mar 18, 2026 2 min read

Hacker News Tracks GPT-5.4 Mini and Nano as OpenAI Pushes Small Models Into Codex and Agent Work

A March 17, 2026 Hacker News post about GPT-5.4 mini and nano reached 236 points and 143 comments. OpenAI is positioning mini as a fast coding and tool-use model for Codex, the API, and ChatGPT, while nano targets cheaper classification, extraction, and subagent workloads.

#openai #gpt-5.4 #small-models

LLM X/Twitter Mar 18, 2026 1 min read

OpenAI broadens its developer model stack with GPT-5.4 mini and nano

OpenAI Developers said on X that GPT-5.4 mini and nano are now part of the GPT-5.4 family for developer workflows. OpenAI positions mini as a faster coding and tool-use model for API, Codex, and ChatGPT, while nano is the lowest-cost option for lighter API workloads.

#openai #gpt-5.4 #small-models

LLM Reddit Mar 18, 2026 2 min read

r/MachineLearning highlights mlx-tune for Apple Silicon LLM fine-tuning with an Unsloth-style API

A project post in r/MachineLearning points to mlx-tune, a library that wraps Apple’s MLX stack in an Unsloth-compatible training API for SFT, DPO, GRPO, LoRA, and vision-language fine-tuning on Apple Silicon Macs.

#apple-silicon #mlx #fine-tuning