#nvidia

LLM X/Twitter 1d ago 1 min read

Nemotron 3 Nano RL Run Raises Math Accuracy From 22% to 91%

NVIDIA says a hosted RL loop lifted Nemotron 3 Nano from 22% to 91% accuracy on a math task for under $5, ending with a downloadable LoRA adapter.

#nvidia #nemotron #reinforcement-learning

LLM X/Twitter 1d ago 1 min read

NVIDIA ModelExpress Cuts DeepSeek-V4 Pro Startup From 8 Minutes

NVIDIA says ModelExpress reduced DeepSeek-V4 Pro startup from 8 minutes to 1 minute 44 seconds by moving weights directly over GPU-to-GPU RDMA.

#nvidia #modelexpress #inference

AI X/Twitter 4d ago 1 min read

Blackwell Ultra reaches 1,648 TFLOPS per GPU on DeepSeek-V3

AI infrastructure competition is being measured in training throughput, not just chip availability. NVIDIA says Blackwell Ultra reached 1,648 TFLOPS per GPU on DeepSeek-V3 671B, about 3x prior delivered performance.

#nvidia #blackwell #deepseek-v3

Sciences 5d ago 2 min read

BMS turns eight Vera Rubin racks into a drug-discovery AI factory

Bristol Myers Squibb is adding a second DGX SuperPOD built on eight DGX Vera Rubin NVL72 systems. The move turns AI infrastructure from a specialist resource into a shared platform for researchers across the company’s global drug-discovery pipeline.

#bms #nvidia #drug-discovery

AI 5d ago 2 min read

NVIDIA puts 4B Cosmos 3 Edge at the center of local physical AI

NVIDIA’s SIGGRAPH update shifts physical AI from cloud demos toward edge deployment. The package includes the 4B Cosmos 3 Edge world model, a Synthetic Video Detector NIM microservice, and a DGX Station agent stack built around Nemotron 3 Ultra.

#nvidia #cosmos #physical-ai

LLM X/Twitter Jul 18, 2026 1 min read

Nemotron 3 Embed tops LMEB with 8B first and 1B second

NVIDIA says Nemotron 3 Embed now leads LMEB, with the 8B model ranked first and the 1B model second. The linked Hugging Face discussion cites LMEB scores of 64.4 for 8B and 61.5 for 1B BF16, extending the release beyond its earlier RTEB win.

#nvidia #nemotron #embeddings

LLM X/Twitter Jul 17, 2026 1 min read

NVIDIA Nemotron 3 Embed 8B takes the top RTEB retrieval slot

Retrieval models are becoming a direct quality and cost lever for RAG and agents. NVIDIA says Nemotron 3 Embed 8B ranks first overall on RTEB, with 32k context and smaller 1B variants.

#nvidia #nemotron #retrieval

AI X/Twitter Jul 17, 2026 1 min read

NVIDIA DeepStream 9.1 adds 13 agent skills for video AI

Video analytics development is moving from hand-built pipeline wiring toward natural-language instructions plus coding agents. DeepStream 9.1 adds 13 agentic skills and JetPack 7.2 support.

#nvidia #deepstream #video-ai

Humanoid Robots NVIDIA Blog Jul 16, 2026 2 min read

Jetson T3000 puts 865 FP4 TFLOPS inside smaller robot hardware

NVIDIA’s new Jetson T3000 and T2000 modules push Blackwell-class edge AI into smaller robotics systems. The real shift is cost and deployment: 865 FP4 TFLOPS, Cosmos 3 Edge, and memory-saving agent skills are being packaged for robots that cannot depend on cloud inference.

#nvidia #jetson #robotics

AI X/Twitter Jul 15, 2026 1 min read

NVIDIA Cosmos 3 post-training lifts traffic VQA to 93.35%

NVIDIA showed Cosmos 3 Nano rising from 54.41% zero-shot accuracy to 93.35% after LoRA and TAO AutoML on a traffic safety video QA task. The result frames agent-run post-training as a practical physical AI workflow.

#nvidia #cosmos #tao

LLM X/Twitter Jul 14, 2026 2 min read

NVIDIA ties LLM shape to GPU latency with 128 and 256 alignment rules

NVIDIA’s first AI Model Co-Design post argues that LLM dimensions can be as important as scale for inference performance. Its rules of thumb include 128-aligned dimensions, preference for 256 or 512, NVFP4, and parallelism strategies for MoE models.

#nvidia #llm-inference #gpu

AI Jul 8, 2026 2 min read

NVIDIA Vera targets agent loops with 1.8x sustained per-core performance versus x86

NVIDIA detailed Vera, a CPU designed for agentic AI workloads where tool calls, code execution, retrieval, and verification sit between model calls. The company claims 50% higher IPC than Grace and 1.8x sustained per-core performance versus x86 on agentic execution workloads.

#nvidia #vera #ai-infrastructure