Cloudflare brings large open-source models to Workers AI, starting with Kimi K2.5
Original: Powering the agents: Workers AI now runs large models, starting with Kimi K2.5
What Cloudflare launched
Cloudflare said on March 19, 2026 that Workers AI now supports frontier-scale open-source models, starting with Moonshot AI's Kimi K2.5. Cloudflare highlighted a 256k-token context window together with multi-turn tool calling, vision inputs, and structured outputs, positioning the model as a fit for agentic workloads.
The company is using the launch to push a broader platform narrative. By running a larger model directly inside Workers AI, Cloudflare says developers can keep more of the agent lifecycle on one stack, from inference and tool use to state handling and workflow execution inside the wider Cloudflare Developer Platform.
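To make the capability list concrete, here is a minimal sketch of what an agentic request to a model on Workers AI could look like. This is an illustration, not Cloudflare's documented usage: the model slug, the system prompt, and the `read_file` tool are all hypothetical, and only the general `env.AI.run(model, inputs)` binding pattern is taken from the Workers AI platform.

```typescript
// Hypothetical model identifier -- the announcement does not give the exact slug.
const MODEL = "@cf/moonshotai/kimi-k2.5";

// Minimal tool-definition shape for illustration.
interface ToolDef {
  name: string;
  description: string;
  parameters: Record<string, unknown>;
}

// Build a request payload exercising the advertised features:
// multi-turn messages, tool calling, and structured outputs.
function buildAgentRequest(userPrompt: string, tools: ToolDef[]) {
  return {
    messages: [
      { role: "system", content: "You are a code-review agent." },
      { role: "user", content: userPrompt },
    ],
    tools, // tools the model may call across turns
    response_format: { type: "json_schema" }, // ask for structured output
  };
}

// Inside a Worker with an AI binding, the call would look roughly like:
//   const result = await env.AI.run(MODEL, buildAgentRequest(prompt, tools));

const req = buildAgentRequest("Review this diff for injection bugs.", [
  {
    name: "read_file", // hypothetical tool
    description: "Fetch a file from the repo under review",
    parameters: { type: "object", properties: { path: { type: "string" } } },
  },
]);
console.log(req.messages.length, req.tools[0].name);
```

The point of the sketch is that inference, tool definitions, and output constraints all live in one Worker, which is the "one stack" claim the announcement makes.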
The economics are the bigger story
Cloudflare said Kimi K2.5 is already running inside its internal OpenCode environment and in Bonk, its public code review agent. One internal security review agent processes more than 7B tokens per day and, according to Cloudflare, has caught more than 15 confirmed issues in a single codebase. The company said running that use case on a mid-tier proprietary model would have cost about $2.4M per year, while switching to Kimi on Workers AI cut costs by 77%.
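Taking Cloudflare's stated figures at face value, the arithmetic works out as follows. The per-million-token rate at the end is derived from the $2.4M and 7B-tokens/day numbers, not something Cloudflare published.

```typescript
// Back-of-envelope check on the stated numbers.
const proprietaryAnnualUsd = 2_400_000; // stated ~$2.4M/yr on a mid-tier proprietary model
const savingsRate = 0.77;               // stated 77% cost reduction

const kimiAnnualUsd = proprietaryAnnualUsd * (1 - savingsRate); // ~$552,000/yr
const savedUsd = proprietaryAnnualUsd - kimiAnnualUsd;          // ~$1.85M/yr saved

// Implied proprietary rate at ~7B tokens/day (derived estimate):
const tokensPerYear = 7e9 * 365;                                          // ~2.56T tokens
const usdPerMillionTokens = proprietaryAnnualUsd / (tokensPerYear / 1e6); // ~$0.94

console.log(Math.round(kimiAnnualUsd), Math.round(savedUsd), usdPerMillionTokens.toFixed(2));
```

So the 77% reduction implies a run rate of roughly $550k per year for the same workload, and an effective proprietary price of under a dollar per million tokens at that volume.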
That matters because open-source models are increasingly being judged on production fit, not just benchmark visibility. If a model can handle large context, structured tool use, and high-volume inference while materially reducing cost, it becomes a realistic default for coding, review, and security automation workloads instead of a fallback option.
Why the move matters
The launch suggests the AI infrastructure race is moving from raw model access toward integrated agent stacks. Cloudflare is trying to show that developers do not need a separate orchestration layer to run useful agents at scale if the platform already combines inference, tools, workflows, and deployment primitives. That puts simultaneous pressure on proprietary model vendors and competing developer platforms.
Related Articles
Perplexity said on March 11, 2026 that its Sandbox API will become both an Agent API tool and a standalone service. Existing docs already frame Agent API as a multi-provider interface with explicit tool configuration, so the update pushes code execution closer to a first-class orchestration primitive.
Perplexity said on March 13, 2026 that Perplexity Computer is now available on mobile, starting with iOS inside the Perplexity app. Coming one day after the company opened Computer to Pro subscribers, the update turns the product into a more explicit cross-device agent workflow rather than a desktop-only experience.
OpenAI posted on March 5, 2026 that GPT-5.4 Thinking and GPT-5.4 Pro are rolling out across ChatGPT, the API, and Codex. The launch article positions GPT-5.4 as a professional-work model with 1M-token context, native computer use, stronger tool search, and better spreadsheet, document, and presentation performance.