Cloudflare scanned 200,000 high-traffic domains and found only 4% declaring AI usage preferences, 3.9% supporting Markdown content negotiation, and fewer than 15 exposing MCP Server Cards or API Catalogs. Its Agent Readiness score turns the agent-facing web into an audit checklist.
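One of the scanned signals, Markdown content negotiation, is simple to picture: a client whose Accept header asks for text/markdown gets a Markdown twin of the page instead of HTML. A minimal Worker sketch of the idea, where the `.md`-suffix convention and the crude Accept check (no q-value parsing) are my assumptions, not a prescribed layout:

```ts
export default {
  async fetch(request: Request): Promise<Response> {
    const accept = request.headers.get("Accept") ?? "";
    // Crude check; a full implementation would parse q-values per RFC 9110.
    if (accept.includes("text/markdown")) {
      const mdUrl = new URL(request.url);
      mdUrl.pathname += ".md"; // assumed convention: a Markdown twin per page
      const md = await fetch(mdUrl.toString());
      if (md.ok) {
        return new Response(md.body, {
          headers: { "Content-Type": "text/markdown", Vary: "Accept" },
        });
      }
    }
    return fetch(request); // everyone else gets the HTML page
  },
};
```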
Cloudflare is giving paid-plan sites a way to turn canonical tags into HTTP 301 redirects for verified AI training crawlers. The move matters because Cloudflare saw 4.8M AI crawler visits to its own developer docs in 30 days, with deprecated pages consumed at the same rate as current ones.
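The mechanic is easy to sketch at the edge: read the page's rel="canonical" tag and turn it into a 301 for training crawlers only. A hedged Worker sketch, where the user-agent regex is a stand-in for Cloudflare's actual crawler verification:

```ts
// Stand-in crawler check; Cloudflare's feature verifies crawlers itself.
const AI_CRAWLER_UA = /GPTBot|ClaudeBot|CCBot|Google-Extended/i;

export default {
  async fetch(request: Request): Promise<Response> {
    const response = await fetch(request);
    if (!AI_CRAWLER_UA.test(request.headers.get("User-Agent") ?? "")) {
      return response; // regular visitors get the page unchanged
    }
    let canonical: string | null = null;
    // HTMLRewriter streams the page; capture rel="canonical" if present.
    const probe = new HTMLRewriter()
      .on('link[rel="canonical"]', {
        element(el) {
          canonical = el.getAttribute("href");
        },
      })
      .transform(response.clone());
    await probe.text(); // drain the stream so the handler above fires
    if (canonical) {
      const target = new URL(canonical, request.url).href;
      if (target !== request.url) return Response.redirect(target, 301);
    }
    return response; // page is its own canonical: serve it
  },
};
```

So a deprecated doc page stops feeding stale content to training crawlers and instead points them at the current version, without changing what human visitors see.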
Why it matters: Cloudflare is attacking the memory-bandwidth bottleneck in LLM serving rather than only buying more GPUs. Its post reports 15-22% model-size reduction, about 3 GB VRAM saved on Llama 3.1 8B, and open-sourced GPU kernels.
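The numbers square with simple arithmetic, assuming an FP16 baseline:

```ts
// Back-of-envelope check (assumes an FP16 baseline at 2 bytes per weight).
const params = 8e9;               // Llama 3.1 8B
const weightBytes = params * 2;   // ~16 GB of weights in FP16
const savedBytes = 3 * 1024 ** 3; // the ~3 GB Cloudflare reports
console.log(`${((savedBytes / weightBytes) * 100).toFixed(0)}% reduction`);
// -> "20% reduction", inside the reported 15-22% range
```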
Why it matters: long-running agents need memory that survives beyond one prompt without replaying every message. Cloudflare says Agent Memory is in private beta and keeps useful state available without filling the context window.
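The pattern it targets is worth making concrete. This is a hypothetical interface, not Cloudflare's actual API (which is in private beta): durable state lives outside the prompt, and only a compact, relevant slice is injected per turn.

```ts
// Hypothetical shape of the agent-memory pattern; names are illustrative.
interface AgentMemory {
  remember(fact: string, tags?: string[]): Promise<void>;
  recall(query: string, limit: number): Promise<string[]>;
}

async function buildPrompt(memory: AgentMemory, userMessage: string) {
  // Pull a handful of relevant facts instead of replaying the whole history.
  const facts = await memory.recall(userMessage, 5);
  return [
    { role: "system", content: `Known context:\n${facts.join("\n")}` },
    { role: "user", content: userMessage },
  ];
}
```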
HN focused on the plumbing question: does a 14-plus-provider inference layer actually make agent apps easier to operate? Cloudflare framed AI Gateway, Workers AI bindings, and a broader multimodal catalog as one platform, while commenters compared it with OpenRouter and pressed on pricing accuracy, catalog overlap, and deployment trust.
Cloudflare says Workers AI now serves Kimi K2.5 3x faster for agent workloads: p90 time per token dropped from roughly 100 ms to 20-30 ms, and peak input-token cache hit ratios rose from 60% to 80% for heavy internal users.
Cloudflare is turning AutoRAG into AI Search, a retrieval primitive agents can create and query from Workers. The open beta adds BM25 plus vector search, built-in storage and index, metadata boosting, and cross-instance search with concrete free and paid limits.
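What querying it from a Worker could look like, assuming the AutoRAG-era binding shape carries over to AI Search; the instance name and option names here are illustrative, not confirmed API:

```ts
export default {
  async fetch(request: Request, env: { AI: any }): Promise<Response> {
    // "docs-index" is a placeholder AI Search instance name.
    const answer = await env.AI.autorag("docs-index").search({
      query: "how do I rotate an API token?",
    });
    // Hybrid retrieval under the hood: BM25 keyword scores fused with
    // vector similarity, plus any metadata boosts configured on the index.
    return Response.json(answer);
  },
};
```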
HN treated Cloudflare Email Service less as agent magic and more as a new email sender entering a hostile protocol world. The thread focused on Workers integration, SES alternatives, spam pressure, MTA-STS, and sending limits.
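The Workers-integration angle maps onto the existing send_email binding. A sketch assuming the new Email Service keeps that shape; the binding name NOTIFY and the addresses are placeholders:

```ts
import { EmailMessage } from "cloudflare:email";
import { createMimeMessage } from "mimetext";

export default {
  async fetch(request: Request, env: { NOTIFY: any }): Promise<Response> {
    // Build a MIME message, then hand it to the email binding.
    const msg = createMimeMessage();
    msg.setSender({ name: "Alerts", addr: "alerts@example.com" });
    msg.setRecipient("oncall@example.com");
    msg.setSubject("Agent run finished");
    msg.addMessage({ contentType: "text/plain", data: "Done." });
    await env.NOTIFY.send(
      new EmailMessage("alerts@example.com", "oncall@example.com", msg.asRaw())
    );
    return new Response("sent");
  },
};
```

Deliverability (SPF, DKIM, MTA-STS, spam reputation) stays the hard part regardless of how clean the sending API is, which is where most of the HN skepticism landed.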
Cloudflare is trying to make model choice less sticky: AI Gateway now routes Workers AI calls to 70+ models across 12+ providers through one interface. For agent builders, the important part is not the catalog alone but spend controls, retry behavior, and failover in workflows that may chain ten inference calls for one task.
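AI Gateway's universal endpoint expresses failover as an ordered list of provider calls tried in sequence. A sketch of that shape, where the account/gateway IDs, model names, and env var names are placeholders:

```ts
// POST an ordered list; the gateway falls through on failure.
const url = "https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/my-gateway";

const res = await fetch(url, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify([
    {
      provider: "workers-ai",
      endpoint: "@cf/meta/llama-3.1-8b-instruct",
      headers: { Authorization: `Bearer ${process.env.CF_API_TOKEN}` },
      query: { prompt: "Summarize this ticket..." },
    },
    {
      // Tried only if the first call errors: same task, different provider.
      provider: "openai",
      endpoint: "chat/completions",
      headers: { Authorization: `Bearer ${process.env.OPENAI_API_KEY}` },
      query: {
        model: "gpt-4o-mini",
        messages: [{ role: "user", content: "Summarize this ticket..." }],
      },
    },
  ]),
});
console.log(await res.json());
```

In a ten-call agent chain, that per-request fallback is what keeps one flaky provider from failing the whole task.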
Cloudflare is packaging an enterprise playbook for MCP at the moment companies are wiring agents into internal systems. The headline number is a 99.9% token reduction from its Code Mode design, alongside new Shadow MCP detection for unauthorized remote servers.
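The Code Mode intuition, in a conceptual sketch (names are illustrative, not Cloudflare's API): rather than injecting every MCP tool schema into the prompt and round-tripping one tool call at a time, the model writes a small script against a typed API, and only the script plus its final result transit the context window.

```ts
// Illustrative typed surface generated from MCP tool definitions.
interface CrmApi {
  findCustomer(email: string): Promise<{ id: string; name: string }>;
  openTickets(customerId: string): Promise<{ id: string; title: string }[]>;
}

// The model emits something like this; the platform runs it in a sandbox
// and returns only the final value to the conversation.
async function agentScript(crm: CrmApi) {
  const customer = await crm.findCustomer("jo@example.com");
  const tickets = await crm.openTickets(customer.id);
  return { customer: customer.name, openTickets: tickets.length };
}
```

Intermediate results never touch the context window, which is where a token reduction on the order of 99.9% becomes plausible for schema-heavy, multi-step tool use.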
Cloudflare is moving agent infrastructure out of demo mode: Sandboxes and Containers are now generally available, with 7 recent upgrades aimed at persistent coding workflows. The stack now bundles PTY terminals, credential injection, stateful interpreters, background processes, file watching, snapshots, and higher limits.
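A hedged sketch of driving a persistent sandbox from a Worker, assuming the @cloudflare/sandbox SDK's getSandbox/exec shape; the exact method names and result fields are illustrative, not confirmed API:

```ts
import { getSandbox } from "@cloudflare/sandbox";

export default {
  async fetch(request: Request, env: any): Promise<Response> {
    // Same ID -> same sandbox, so files and background processes persist
    // across requests instead of resetting per invocation.
    const sandbox = getSandbox(env.Sandbox, "build-bot");
    const result = await sandbox.exec("python3 -c 'print(2 + 2)'");
    return new Response(result.stdout);
  },
};
```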
MCP is moving from developer convenience to enterprise control problem. Cloudflare's new architecture matters because it tackles both parts of that shift at once: bloated tool schemas and the security mess created by ungoverned local servers.