Perplexity launches Agent API as a managed runtime for search and tool-using workflows

Original: Agent API: A Managed Runtime for Agentic Workflows

Mar 16, 2026 · By Insights AI

What Perplexity launched

On March 11, 2026, Perplexity introduced the Agent API, describing it as a managed runtime for agentic workflows with integrated search, tool execution, and multi-model orchestration. Perplexity’s framing is important: instead of asking developers to assemble a model router, search layer, embeddings provider, sandbox service, and monitoring stack on their own, the company says those pieces can now be reached through a single integration point.

The product is designed around the agent loop rather than a single model call. Perplexity says the runtime can take an objective, decompose it into steps, decide which tools to use, execute those tools, inspect the results, and iterate until it has a grounded answer. In the company’s example, an agent preparing for a sales call can query an internal CRM, run web searches, fetch specific pages, and then synthesize that internal and external context into one response.
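The loop described above can be illustrated with a toy sketch. This is not Perplexity's implementation or API: the planner, the `crm_lookup` tool, and every name below are hypothetical stand-ins chosen to mirror the sales-call example, and the tools return canned strings instead of calling real services.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


# Hypothetical stand-ins for tools; a real runtime would call live services.
def crm_lookup(query: str) -> str:
    return f"CRM record for {query}"


def web_search(query: str) -> str:
    return f"search results for {query}"


TOOLS: Dict[str, Callable[[str], str]] = {
    "crm_lookup": crm_lookup,
    "web_search": web_search,
}


@dataclass
class AgentState:
    objective: str
    observations: List[str] = field(default_factory=list)


def plan_next_step(state: AgentState) -> Optional[Tuple[str, str]]:
    """Toy planner: return (tool_name, tool_input), or None when done.

    A real planner would be a model call that inspects the observations;
    here the plan is fixed so the example stays deterministic.
    """
    plan = [("crm_lookup", state.objective), ("web_search", state.objective)]
    done = len(state.observations)
    return plan[done] if done < len(plan) else None


def run_agent(objective: str, max_steps: int = 5) -> str:
    state = AgentState(objective)
    for _ in range(max_steps):  # hard cap so the loop always terminates
        step = plan_next_step(state)
        if step is None:  # planner decided no further step is needed
            break
        tool_name, tool_input = step
        result = TOOLS[tool_name](tool_input)  # execute the chosen tool
        state.observations.append(result)  # keep the result for the next decision
    # Stand-in for synthesis: join internal and external context.
    return " | ".join(state.observations)


answer = run_agent("Acme Corp")
print(answer)
```

The step cap is a design choice in this sketch rather than something the launch post specifies; without it, a planner that never returns `None` would loop indefinitely.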

What ships in the first version

The first release includes two built-in tools, web_search and fetch_url. Perplexity says web_search supports domain allowlists and denylists, recency and date-range filters, language filters, and configurable content budgets per page. fetch_url retrieves and extracts the full contents of specific pages. The API also supports custom functions, which means the built-in search stack can be connected to private applications, databases, or internal business systems.
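To make the filter semantics concrete, here is a minimal local sketch of how domain allowlists, denylists, and a date range might narrow a result set. The field names (`allow_domains`, `deny_domains`, `after`, `before`) and the rule that a denylist wins over an allowlist are assumptions for illustration; the article does not give the actual web_search parameter names or precedence.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class SearchFilters:
    # All field names are hypothetical, not the real web_search schema.
    allow_domains: Optional[List[str]] = None
    deny_domains: Optional[List[str]] = None
    after: Optional[date] = None
    before: Optional[date] = None


@dataclass
class Result:
    url: str
    published: date


def _domain_matches(host: str, domain: str) -> bool:
    # Match the domain itself or any subdomain of it.
    return host == domain or host.endswith("." + domain)


def keep(result: Result, f: SearchFilters) -> bool:
    host = urlparse(result.url).hostname or ""
    # Assumption: a denylist hit excludes the result even if it is allowlisted.
    if f.deny_domains and any(_domain_matches(host, d) for d in f.deny_domains):
        return False
    if f.allow_domains and not any(_domain_matches(host, d) for d in f.allow_domains):
        return False
    if f.after and result.published < f.after:
        return False
    if f.before and result.published > f.before:
        return False
    return True


results = [
    Result("https://docs.example.com/a", date(2026, 3, 1)),
    Result("https://spam.example.net/b", date(2026, 3, 2)),
    Result("https://example.com/old", date(2025, 1, 1)),
]
filters = SearchFilters(allow_domains=["example.com"], after=date(2026, 1, 1))
kept = [r.url for r in results if keep(r, filters)]
print(kept)
```

In this example only the first result survives: the second is outside the allowlist and the third falls before the date cutoff.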

Perplexity is also pushing a model-agnostic strategy. The company says the Agent API works across frontier model providers and supports fallback chains, so a request can automatically move to the next model if the preferred one is unavailable. Perplexity’s documentation separately says the platform exposes third-party frontier models at direct provider rates with no markup. That combination of tool use, provider abstraction, and high-availability routing is the clearest technical signal in the launch.
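A fallback chain of the kind described above can be sketched in a few lines. This is a generic pattern, not the Agent API's routing logic: the `ModelUnavailable` exception, the model names, and the callables are all hypothetical, and the "providers" are local functions so the example runs offline.

```python
from typing import Callable, List, Tuple


class ModelUnavailable(Exception):
    """Raised by a (hypothetical) provider call that cannot serve the request."""


def call_with_fallback(
    prompt: str, chain: List[Tuple[str, Callable[[str], str]]]
) -> Tuple[str, str]:
    """Try each model in order; return (model_name, reply) from the first that works."""
    errors = []
    for name, call in chain:
        try:
            return name, call(prompt)
        except ModelUnavailable as exc:
            # Record the failure and move on to the next model in the chain.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models in the chain failed: " + "; ".join(errors))


def primary(prompt: str) -> str:
    # Simulates an outage at the preferred provider.
    raise ModelUnavailable("503 from provider")


def secondary(prompt: str) -> str:
    return f"answer to: {prompt}"


used, reply = call_with_fallback(
    "summarize Q1", [("model-a", primary), ("model-b", secondary)]
)
print(used, reply)
```

Only availability errors trigger a fallback here; any other exception propagates immediately, which keeps genuine bugs from being masked by a silent switch to another model.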

Why this matters

Many companies talking about agents still ship either a chat surface or a thin inference router. Perplexity is trying to sell the execution layer itself. That matters because production agent systems usually fail at orchestration before they fail at generation. The hard parts are planning, retrieval, tool invocation, fallback behavior, and deciding when another step is necessary.

Perplexity is also packaging its internal tuning work as presets. The company says presets expose preconfigured combinations of system prompt, tool set, and cost profile for workloads such as quick factual lookup, balanced research, and deeper multi-source analysis. If that holds up in practice, smaller teams can buy a working runtime instead of building and maintaining the full agent stack themselves. The broader implication is that Perplexity is moving from a consumer answer product toward infrastructure for search-heavy agent workflows.
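One way to picture a preset is as a named, immutable bundle of settings with optional per-request overrides. The sketch below is an illustration only: the preset names, prompts, and the use of `max_steps` as a stand-in for a cost profile are assumptions, not Perplexity's actual preset definitions.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class Preset:
    system_prompt: str
    tools: Tuple[str, ...]
    max_steps: int  # hypothetical stand-in for a "cost profile"


# Hypothetical presets mirroring the three workloads named in the article.
PRESETS: Dict[str, Preset] = {
    "quick-lookup": Preset("Answer briefly from one source.", ("web_search",), 1),
    "balanced-research": Preset(
        "Compare a few sources.", ("web_search", "fetch_url"), 4
    ),
    "deep-analysis": Preset(
        "Cross-check many sources.", ("web_search", "fetch_url"), 12
    ),
}


def resolve(name: str, **overrides) -> Preset:
    """Look up a preset and apply per-request overrides without mutating it."""
    return replace(PRESETS[name], **overrides)


config = resolve("deep-analysis", max_steps=3)
print(config)
```

Freezing the dataclass means an override produces a new configuration and the shared preset table stays unchanged between requests.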

Sources: Perplexity blog · Perplexity docs


© 2026 Insights. All rights reserved.