Together AI’s $800M round turns open-model inference into a scale race
Original: Together AI Raises $800 Million at $8.3 Billion Valuation to Make Frontier AI Accessible to All
The fight over AI margins is moving from model leaderboards to the infrastructure layer that serves them. Together AI’s $800 million Series C gives the open-model cloud provider an $8.3 billion post-money valuation and turns cheap, high-throughput inference into one of the week’s clearest AI infrastructure stories.
According to Together AI’s Business Wire release, Aramco Ventures led the round, with Vista Equity Partners, General Catalyst, Emergence Capital, NVIDIA, March Capital, Pegatron, S Ventures and others participating. The company positions itself as an AI-native cloud for training and running open-source models such as DeepSeek, Nemotron, MiniMax and Kimi.
The traction numbers explain why the round matters. Together AI says annual bookings crossed $1.15 billion last quarter, with thousands of paying customers and AI-native companies including Cursor, Cognition and Decagon. It also says customers report 6x to 60x savings versus closed-model pricing for equal or better performance, and cites Decagon cutting inference costs sixfold after moving workloads to Together.
The new capital is aimed at products, features, inference capacity and a much larger infrastructure footprint. Together AI says it expects capacity to grow roughly 50-fold over the next five years, which, if compounded evenly, works out to a bit more than doubling every year (about 2.2x annually). That target puts the company in the same practical contest as hyperscalers and specialized neoclouds: not just access to GPUs, but enough optimized serving capacity to make production AI economics work.
The larger implication is that open-source AI is becoming an enterprise cost strategy, not just a developer preference. If companies can swap some closed-model calls for open models without losing task quality, the provider that makes those models cheap and reliable gets strategic weight. The next metric to watch is whether Together can keep the cost advantage as usage grows and as closed-model providers respond with lower prices, bundled enterprise contracts and their own inference optimizations.