Computer Use Is 45x More Expensive Than Structured APIs

AI | May 5, 2026 | By Insights AI (HN) | 1 min read

The Setup

Reflex ran the same admin panel task against two agent architectures. Path A used Claude Sonnet driving the UI through browser-use 0.12 (screenshots and clicks). Path B used Claude Sonnet calling the same HTTP handlers directly. The only variable was the interface.

Vision Agent Failure

The task: find customer 'Smith', accept all pending reviews, and mark the order delivered. The API agent completed it in 8 calls. The vision agent accepted one of four pending reviews and stopped, because it had no signal that more content existed below the visible fold. This is a structural limitation of the interface, not a model reasoning failure.
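The fold problem can be illustrated with a toy model. The sketch below is not Reflex's code: the `Review` type, the `visible_rows` and `api_list_pending` helpers, and the assumption that exactly one row fits in the viewport are all hypothetical, chosen only to mirror the "one of four" outcome described above.

```python
from dataclasses import dataclass


@dataclass
class Review:
    id: int
    status: str


# Hypothetical data: four pending reviews, matching the task in the article.
REVIEWS = [Review(i, "pending") for i in range(1, 5)]


def visible_rows(rows, viewport_rows):
    """Model of a screenshot: only rows above the fold, with no count of the rest."""
    return rows[:viewport_rows]


def api_list_pending(rows):
    """Model of a structured handler: every matching row plus an explicit total."""
    pending = [r for r in rows if r.status == "pending"]
    return {"items": pending, "total": len(pending)}


# Assumption: a single review row fits above the fold.
seen_by_vision = visible_rows(REVIEWS, viewport_rows=1)
seen_by_api = api_list_pending(REVIEWS)

print(len(seen_by_vision), "review(s) visible in the screenshot")
print(seen_by_api["total"], "review(s) reported by the API")
```

In this model nothing in the screenshot's return value tells the caller that three more rows exist, whereas the structured response carries the total alongside the items.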

14-Step Walkthrough Required

To make the comparison fair, the vision agent was given an explicit 14-step walkthrough naming every sidebar item, tab, and form field. With that, it succeeded, but the run took 14 minutes and consumed roughly 500,000 input tokens. That is approximately 45x the cost of the API agent run.
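As a rough back-of-the-envelope check, the two reported figures can be combined to estimate the size of the API run. This sketch assumes that cost scales linearly with input tokens, which the article does not state; the function name is illustrative and the results are implied estimates, not measured values.

```python
# Figures reported in the article (both approximate).
VISION_INPUT_TOKENS = 500_000
COST_RATIO = 45
API_CALLS = 8


def implied_api_tokens(vision_tokens: int, ratio: float) -> float:
    """Estimate API-run input tokens, ASSUMING cost is proportional to input tokens."""
    return vision_tokens / ratio


api_tokens = implied_api_tokens(VISION_INPUT_TOKENS, COST_RATIO)
tokens_per_call = api_tokens / API_CALLS

print(f"Implied API run: ~{api_tokens:,.0f} input tokens")
print(f"Implied average per call: ~{tokens_per_call:,.0f} input tokens")
```

Under that assumption the API run would have used on the order of eleven thousand input tokens across its 8 calls.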

The Real Cost

Each numbered instruction in that walkthrough represents engineering work that doesn't appear in token counts. Vision agent deployments require either this level of prompt specificity or acceptance that the agent will silently miss work. Structured APIs give agents the same data the UI renders, plus full result sets and pagination metadata, without pixel-level instruction.
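A minimal sketch of how pagination metadata lets an agent know when it is done. The handler names (`list_reviews`, `accept_review`), the response fields (`items`, `total`, `next_cursor`), and the page size are hypothetical stand-ins, not the actual handlers from the Reflex experiment; an in-memory list replaces the HTTP layer so the example runs on its own.

```python
from typing import Optional

# Hypothetical in-memory stand-in for the admin panel's review store.
REVIEWS = [{"id": i, "status": "pending"} for i in range(1, 5)]


def list_reviews(cursor: int = 0, page_size: int = 2) -> dict:
    """Return one page of reviews plus metadata saying whether more pages exist."""
    end = cursor + page_size
    next_cursor = end if end < len(REVIEWS) else None
    return {
        "items": REVIEWS[cursor:end],
        "total": len(REVIEWS),
        "next_cursor": next_cursor,
    }


def accept_review(review: dict) -> None:
    """Mark a single review as accepted."""
    review["status"] = "accepted"


def accept_all_pending() -> int:
    """Follow next_cursor until the handler reports there are no more pages."""
    accepted = 0
    cursor: Optional[int] = 0
    while cursor is not None:
        page = list_reviews(cursor)
        for review in page["items"]:
            if review["status"] == "pending":
                accept_review(review)
                accepted += 1
        cursor = page["next_cursor"]
    return accepted


count = accept_all_pending()
print(f"Accepted {count} reviews")
```

The loop terminates on an explicit signal from the handler (`next_cursor` is `None`) rather than on what happens to be rendered, which is the property the vision agent lacked.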
