Cloudflare is trying to make model choice less sticky: AI Gateway now routes Workers AI calls to 70+ models across 12+ providers through one interface. For agent builders, the important part is not the catalog alone but spend controls, retry behavior, and failover in workflows that may chain ten inference calls for one task.
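The retry-then-failover pattern that matters in those multi-call workflows can be sketched in a few lines. This is a generic illustration, not Cloudflare's actual gateway API: the provider names and callables below are stand-ins for gateway-routed model backends.

```python
import time

def call_with_failover(providers, prompt, retries=1, backoff=0.1):
    """Try each backend in order; retry transient failures before failing over.

    `providers` is an ordered list of (name, callable) pairs. Names and shapes
    here are illustrative only -- a real gateway handles this server-side.
    """
    errors = []
    for name, call in providers:
        for attempt in range(retries + 1):
            try:
                return name, call(prompt)
            except Exception as exc:  # in practice, catch provider-specific errors
                errors.append((name, attempt, str(exc)))
                time.sleep(backoff * (2 ** attempt))  # exponential backoff
    raise RuntimeError(f"all providers failed: {errors}")

# Stub backends: the primary always times out, the fallback succeeds.
def flaky(prompt):
    raise TimeoutError("upstream timeout")

def stable(prompt):
    return f"echo: {prompt}"

name, result = call_with_failover(
    [("provider-a", flaky), ("provider-b", stable)], "ping", retries=1, backoff=0.0
)
# name == "provider-b", result == "echo: ping"
```

The point of routing through one interface is that this loop (and the spend accounting around it) lives in one place instead of being reimplemented per provider SDK.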
OpenAI’s updated Agents SDK adds a model-native harness and native sandbox execution so agents can inspect files, run commands, edit code, and continue across longer tasks. It launches generally available in Python with support for sandbox providers including Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel.
Amjad Masad said on April 10, 2026 that Accenture is investing in Replit, adopting it internally, and partnering to bring secure vibecoding to enterprises globally. The public first-party disclosure is brief, but the combination of capital, internal deployment, and access to a consulting organization with 700,000-plus employees makes this a meaningful enterprise distribution move for AI-assisted software creation.
OpenAI said on March 24, 2026 that it is publishing prompt-based teen-safety policies designed for gpt-oss-safeguard and other reasoning models. The initial release covers six risk areas and was developed with input from Common Sense Media and everyone.ai.
Google introduced Gemini 3.1 Flash-Lite on March 3, 2026 as the fastest and most cost-efficient model in the Gemini 3 series, targeting high-volume developer workloads. It is rolling out in preview through the Gemini API in Google AI Studio and Vertex AI at $0.25 per 1M input tokens and $1.50 per 1M output tokens, with Google claiming a 2.5x faster time to first answer token, 45% higher output speed, and stronger benchmark scores than the prior 2.5 Flash tier.
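At those listed rates, per-request cost is simple arithmetic. A back-of-envelope estimator using the announced preview prices (the function and workload numbers are illustrative):

```python
# Listed preview prices for Gemini 3.1 Flash-Lite (per the announcement).
INPUT_PER_M = 0.25   # USD per 1M input tokens
OUTPUT_PER_M = 1.50  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Back-of-envelope cost in USD at the listed per-million-token rates."""
    return (input_tokens / 1_000_000) * INPUT_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PER_M

# Example: a 10M-input / 1M-output daily workload.
daily = estimate_cost(10_000_000, 1_000_000)  # 10 * 0.25 + 1 * 1.50 = 4.00 USD
```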
Google introduced Project Spend Caps, revamped Usage Tiers, and new billing dashboards for Gemini API developers in AI Studio. The update is aimed at making cost control and scaling behavior more predictable for teams moving into paid usage.
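A project-level cap is enforced server-side, but teams often pair it with a client-side soft limit so requests stop (or degrade) before the hard cap trips. A minimal sketch of that pattern; the `SpendGuard` class is hypothetical and unrelated to Google's actual API:

```python
class SpendGuard:
    """Client-side budget check to complement a server-enforced spend cap.

    Hypothetical helper: tracks a local cost estimate and refuses further
    calls once a soft limit would be exceeded.
    """

    def __init__(self, cap_usd: float):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd: float) -> bool:
        """Record the cost if it fits under the cap; return False otherwise."""
        if self.spent_usd + cost_usd > self.cap_usd:
            return False  # over budget: caller should stop or fall back
        self.spent_usd += cost_usd
        return True

guard = SpendGuard(cap_usd=5.00)
ok = guard.charge(4.00)       # within the $5 cap
blocked = guard.charge(2.00)  # would exceed the cap, so it is refused
```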
On March 5, 2026, OpenAI introduced GPT-5.4 as a flagship model focused on relevance, contextual understanding, and instruction following. In the API, it pairs a 1M-token context window with stronger tool search for long, multi-tool workflows.
OpenAI Developers posted on March 12, 2026 that the Video API now supports a broader Sora 2 workflow. The update adds reusable characters, video extensions, longer clips, portrait and landscape exports, and batch processing for studio-style pipelines.
Google introduced the Developer Knowledge API and an open-source MCP Server on February 4, 2026. The tools are meant to connect internal documentation, public URLs, and other team knowledge sources to Gemini Code Assist and AI-agent workflows with less custom plumbing.
Google DeepMind said Gemini 3.1 Flash-Lite is rolling out in preview through the Gemini API and Google AI Studio. The company positioned it as the most cost-efficient Gemini 3 model, with lower price, faster performance, and tunable thinking levels.
Anthropic said Claude Code now includes Code Review, a feature that dispatches multiple agents on every pull request. The feature is in research preview for Team and Enterprise plans, with the company pitching depth-first reviews rather than lightweight skims.