On May 1, the U.S. Department of Defense struck agreements with seven tech companies to deploy AI on its highest-security networks. Anthropic, which insisted on safety guardrails against autonomous weapons, is conspicuously absent.
#nvidia
The U.S. Department of Defense finalized AI deployment agreements with OpenAI, Google, Microsoft, AWS, NVIDIA, SpaceX, Reflection AI, and Oracle for its most classified networks. Anthropic was excluded after refusing to allow Claude to be used for purposes including autonomous weapons and mass surveillance.
A LocalLLaMA community member completed a 16-node DGX Spark cluster with 200 Gbps networking, optimized for unified-memory LLM inference, and plans to test it with DeepSeek and Kimi models.
NVIDIA is targeting the cost bottleneck in multimodal agents, not just the demo factor. Nemotron 3 Nano Omni claims up to 9x higher throughput, a 256K context window, and six leaderboard wins for document, video, and audio understanding.
Swedish legal AI startup Legora raised a total of $600M in its Series D, reaching a $5.6B valuation, with NVIDIA's NVentures making its first-ever legal AI investment. The company crossed $100M ARR between the two tranche closings, setting up a direct rivalry with Harvey.
Multimodal agents still pay a tax for chaining separate vision, audio, and text models. NVIDIA says Nemotron 3 Nano Omni collapses that stack into a 30B model with 256K context and up to 9.2x higher effective video system capacity at the same responsiveness target.
This is less about one more cloud partnership and more about the infrastructure shape of the next agent wave. NVIDIA and Google Cloud say A5X Rubin systems can scale to 80,000 GPUs per site and 960,000 across multisite clusters, while cutting inference cost per token and boosting token throughput per megawatt by up to 10x versus the prior generation.
NVIDIA released Nemotron-Personas-Korea on Hugging Face with 7 million synthetic personas grounded in Korean public statistics. The dataset matters because agent localization is no longer only translation; it needs region, honorifics, occupations, and public-service context.
Why it matters: post-training agents increasingly depend on reinforcement learning throughput, not only inference speed. NVIDIA says NeMo RL’s FP8 path speeds RL workloads by 1.48x on Qwen3-8B-Base while tracking BF16 accuracy.
Why it matters: NVIDIA is aiming generative video research at simulation-ready 3D environments rather than short clips. The tweet says Lyra 2.0 maintains per-frame 3D geometry and uses self-augmented training, while the project page shows outputs as Gaussian splats and meshes that can be exported to Isaac Sim.
Coding agents are being tested on GPU performance work, not just app scaffolding. Cursor says its NVIDIA collaboration produced a 38% geomean speedup across 235 CUDA kernel problems in three weeks.
Why it matters: NVIDIA is turning quantum calibration and error correction into an open model-and-tooling stack instead of a lab-only workflow. The April 14 tweet framed Ising as an open suite, and NVIDIA’s technical post says Ising Calibration 1 scored 14.5% above GPT-5.4 and 3.27% above Gemini 3.1 Pro on QCalEval.