Claude agents closed 186 office deals in Anthropic's market test

The useful part of Anthropic’s latest research is not that Claude chatted convincingly. It is that the company put agent behavior inside a real market and let it run. In a one-week internal experiment called Project Deal, Anthropic asked Claude-powered agents to buy, sell, and negotiate on behalf of employees in its San Francisco office. In the run that counted for actual exchanges, 69 agents closed 186 deals across more than 500 listed items, with total transaction value a little above $4,000.

“We created a marketplace for employees in our San Francisco office”

That line came from Anthropic’s source tweet, which linked to the full Project Deal write-up. The setup matters. Participants told Claude what they might want to buy or sell, and each agent received a $100 budget. The market itself ran in Slack, and once the experiment started there was no human sign-off on bids, counteroffers, or final agreements. The agents had to find matches, negotiate in natural language, and decide when to close.

Anthropic did not run the study only once. It operated four parallel versions of the marketplace. Two runs used Claude Opus 4.5 for every participant, while two other runs mixed in Claude Haiku 4.5 for half of the participants. Anthropic’s follow-up comments and the research page both point in the same direction: higher-quality models had a real advantage. That matters because it turns “better model” into something more concrete than a benchmark delta. In an agent economy, model quality can show up as better pricing, better deal selection, and higher completion rates.

The experiment also exposed rough edges rather than hiding them. Some agents tried to lowball. Some leaned too hard toward being liked. Others had to juggle user preferences against market incentives. Anthropic’s page argues that agent marketplaces could be useful, but also warns that once agents start operating in more adversarial settings, optimization pressure may produce stranger and less human-friendly behavior. That is a stronger takeaway than a simple product teaser because it shows both feasibility and the coordination problems that arrive with it.

The AnthropicAI account usually mixes Claude launches, safety policy, and research notes; this post sits firmly in the research bucket. What to watch next is whether Anthropic expands Project Deal into harsher environments, publishes more failure cases, or measures how often stronger agents dominate weaker ones. For now, the clearest signal is that delegated negotiation is no longer theoretical. The primary sources are the tweet and Anthropic’s full write-up.

Claude agents closed 186 office deals in Anthropic's market test

Related Articles

Anthropic revisits a multi-agent Claude harness for long-running software engineering

Claude turns the advisor pattern into a native tool on Claude Platform

Anthropic launches Claude Managed Agents to move production agents onto hosted infrastructure

Comments (0)

Leave a Comment

Related Articles

Anthropic revisits a multi-agent Claude harness for long-running software engineering
LLM sources.twitter Mar 28, 2026 2 min read

Claude turns the advisor pattern into a native tool on Claude Platform
LLM sources.twitter Apr 11, 2026 2 min read

Anthropic launches Claude Managed Agents to move production agents onto hosted infrastructure
LLM sources.twitter Apr 12, 2026 2 min read