Cloudflare brings large open-source models to Workers AI, starting with Kimi K2.5
Original: Powering the agents: Workers AI now runs large models, starting with Kimi K2.5
What Cloudflare launched
Cloudflare said on March 19, 2026 that Workers AI now supports frontier-scale open-source models, starting with Moonshot AI's Kimi K2.5. Cloudflare highlighted a 256k context window together with multi-turn tool calling, vision inputs, and structured outputs, positioning the model as a fit for agentic workloads.
The company is using the launch to push a broader platform narrative. By running a larger model directly inside Workers AI, Cloudflare says developers can keep more of the agent lifecycle on one stack, from inference and tool use to state handling and workflow execution inside the wider Cloudflare Developer Platform.
The economics are the bigger story
Cloudflare said Kimi K2.5 is already running inside its internal OpenCode environment and in Bonk, its public code review agent. One internal security review agent processes more than 7 billion tokens per day and, according to Cloudflare, has caught more than 15 confirmed issues in a single codebase. The company said running that workload on a mid-tier proprietary model would have cost about $2.4M per year, and that switching to Kimi on Workers AI cut that cost by 77%.
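To make the arithmetic behind those figures concrete, here is a back-of-the-envelope sketch in Python. It uses only the numbers Cloudflare quoted. The 365-day year and the idea of a single blended per-token rate are assumptions made for illustration, not Cloudflare's pricing, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the figures quoted by Cloudflare.
# Assumptions (not from Cloudflare): the agent runs 365 days a year, and
# cost can be summarized as one blended rate per million tokens.

TOKENS_PER_DAY = 7_000_000_000          # "more than 7B tokens per day", used as a floor
PROPRIETARY_COST_PER_YEAR = 2_400_000   # "about $2.4M per year", in US dollars
SAVINGS_FRACTION = 0.77                 # "cut costs by 77%"
DAYS_PER_YEAR = 365                     # assumption


def annual_tokens(tokens_per_day: int, days: int = DAYS_PER_YEAR) -> int:
    """Total tokens processed in a year at a constant daily volume."""
    return tokens_per_day * days


def cost_after_savings(baseline_cost: float, savings_fraction: float) -> float:
    """Annual cost that remains after a fractional reduction."""
    return baseline_cost * (1 - savings_fraction)


def blended_rate_per_million(annual_cost: float, tokens_per_year: int) -> float:
    """Implied dollars per million tokens, ignoring any input/output price split."""
    return annual_cost / (tokens_per_year / 1_000_000)


if __name__ == "__main__":
    tokens = annual_tokens(TOKENS_PER_DAY)
    kimi_cost = cost_after_savings(PROPRIETARY_COST_PER_YEAR, SAVINGS_FRACTION)

    print(f"Tokens per year:        {tokens:,}")
    print(f"Remaining annual cost:  ${kimi_cost:,.0f}")
    print(f"Annual savings:         ${PROPRIETARY_COST_PER_YEAR - kimi_cost:,.0f}")
    print(f"Implied baseline rate:  ${blended_rate_per_million(PROPRIETARY_COST_PER_YEAR, tokens):.2f} per 1M tokens")
    print(f"Implied Kimi rate:      ${blended_rate_per_million(kimi_cost, tokens):.2f} per 1M tokens")
```

Under those assumptions, a 77% cut leaves roughly $552,000 of annual spend, and the implied blended rates work out to about $0.94 versus $0.22 per million tokens. Because "more than 7B tokens" is a floor, both rates should be read as rough upper bounds rather than published prices.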
That matters because open-source models are increasingly judged on production fit, not just benchmark rankings. If a model can handle large context, structured tool use, and high-volume inference while materially reducing cost, it becomes a realistic default for coding, review, and security automation workloads rather than a fallback option.
Why the move matters
The launch suggests the AI infrastructure race is moving from raw model access toward integrated agent stacks. Cloudflare is trying to show that developers do not need a separate orchestration layer to run useful agents at scale if the platform already combines inference, tools, workflows, and deployment primitives. That puts pressure on proprietary model vendors and competing developer platforms alike.