Cloudflare brings Kimi K2.5 to Workers AI and tunes the stack for agents

Original post: "Kimi K2.5 is now on Workers AI, helping you power agents entirely on Cloudflare's Developer Platform. Learn how we optimized our inference stack and reduced inference costs for internal agent use cases." https://t.co/kEQ6HHpoJS

LLM · Mar 23, 2026 · By Insights AI · 1 min read

On March 19, 2026, Cloudflare said on X that Moonshot AI’s Kimi K2.5 is now available on Workers AI. In the linked blog post, Cloudflare says Workers AI is officially entering the “big models” tier by offering frontier open-source models directly on its inference platform, starting with Kimi K2.5.

Cloudflare highlights why it picked this model for agentic work. The company says Kimi K2.5 offers a 256k context window plus multi-turn tool calling, vision inputs, and structured outputs, which makes it suitable for long-running, stateful agent flows. The broader pitch is that developers can now keep the entire agent lifecycle on a single platform by combining the model with Cloudflare primitives such as Durable Objects for state, Workflows for long-running tasks, and sandboxed execution surfaces for tools.
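As a concrete illustration of that long-running, stateful flow: each agent turn replays the conversation history so the model sees the full context. The sketch below assumes an OpenAI-style chat payload; the field names (`messages`, `response_format`) and the binding call shown in the comment are illustrative assumptions, not taken from Cloudflare's documentation.

```typescript
// One turn of a stateful agent loop, as a minimal sketch. The large context
// window is what makes replaying the full history viable for long sessions.
type Message = {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
};

interface AgentRequest {
  messages: Message[];
  // Structured outputs: ask the model to reply with JSON (field name assumed).
  response_format?: { type: "json_object" };
}

export function buildTurn(history: Message[], userInput: string): AgentRequest {
  return {
    messages: [...history, { role: "user", content: userInput }],
    response_format: { type: "json_object" },
  };
}

// Inside a Worker, the payload would go to the AI binding, e.g. (model id
// is a hypothetical placeholder):
//   const result = await env.AI.run("@cf/moonshotai/kimi-k2.5", payload);
// with the accumulated `history` persisted in a Durable Object between turns.
```

In this pattern the Durable Object is the single source of truth for the conversation, so the Worker itself can stay stateless.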

  • Cloudflare says it built custom kernels for Kimi K2.5 on top of its Infire inference engine.
  • Workers AI now surfaces cached tokens as a usage metric and discounts cached tokens relative to fresh input tokens.
  • A new `x-session-affinity` header is meant to improve prefix-cache hit rates and reduce latency and cost in multi-turn agent sessions.
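The session-affinity idea can be sketched in a few lines. The `x-session-affinity` header name comes from the announcement, but the endpoint shape and the choice of a stable per-conversation id as its value are assumptions for illustration only.

```typescript
// Cache-aware routing sketch: send every turn of one conversation with the
// same affinity value, so requests land on a node whose prefix cache already
// holds the shared prompt tokens.
export function affinityHeaders(
  sessionId: string,
  apiToken: string
): Record<string, string> {
  return {
    Authorization: `Bearer ${apiToken}`,
    "Content-Type": "application/json",
    // Stable per-conversation value; reusing it across turns is what raises
    // the prefix-cache hit rate.
    "x-session-affinity": sessionId,
  };
}

// Usage (endpoint URL is illustrative, not Cloudflare's documented path):
// await fetch("https://example.invalid/ai/run", {
//   method: "POST",
//   headers: affinityHeaders(conversationId, token),
//   body: JSON.stringify(payload),
// });
```

Since cached tokens are billed at a discount relative to fresh input tokens, a higher hit rate lowers both latency and cost for multi-turn sessions.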

The important part is not just model availability. Many platforms can host a model card. Cloudflare is trying to differentiate by packaging serving optimizations, stateful primitives, and agent infrastructure into the same stack. That matters for teams that want an open-source frontier model without taking on the operational burden of self-hosting, tuning kernels, or building their own cache-aware routing layer for long-context workflows.

Cloudflare also says the Agents SDK starter now uses Kimi K2.5 as its default model, which signals that the launch is meant to feed directly into agent developer workflows rather than sit as a generic catalog entry. The original X post links to the detailed launch post on Cloudflare's blog.


Related Articles

LLM · sources.twitter · 2 min read

Cloudflare said on March 20, 2026 that Kimi K2.5 was available on Workers AI so developers could build end-to-end agents on Cloudflare’s platform. Its launch post says the model brings a 256k context window, multi-turn tool calling, vision inputs, and structured outputs, while an internal security-review agent processing 7B tokens per day cut costs by 77% after the switch.



© 2026 Insights. All rights reserved.