Cursor puts GPT-5.5 atop CursorBench at 72.8% and halves price
The headline from Cursor’s latest X post is not just model availability. It is that GPT-5.5 entered the product with both a concrete benchmark claim and a temporary price cut attached. Cursor says GPT-5.5 is now available in the editor, currently ranks first on CursorBench at 72.8%, and is being sold at 50% off through May 2. In a market where many coding-model updates arrive as vague “feels better” claims, that combination is unusually specific.
“It’s currently the top model on CursorBench at 72.8%.”
That sentence comes directly from Cursor’s source tweet. A matching forum thread added the pricing details and clarified the promotion window after users spotted inconsistent dates in the UI. According to Cursor staff, list pricing is $5.00 per million input tokens, $0.50 for cached input, and $30.00 for output; the temporary discount cuts those to $2.50, $0.25, and $15.00 respectively through the end of May 2. That matters because output-token cost is often what makes frontier coding models hard to use at scale.
The more interesting context is CursorBench itself. In Cursor’s March research post, “How we compare model quality in Cursor,” the company says CursorBench is built from real engineering sessions rather than public repository issues. It argues that the suite tracks actual developer outcomes better than public benchmarks, uses agentic grading, and now covers larger multi-file, tool-using tasks. Cursor also says the current CursorBench-3 task scope has roughly doubled from the initial version and creates more separation among frontier models than saturated public evals.
That does not make 72.8% a neutral industry crown. CursorBench is still an internal benchmark run by the company that sells the product. But it does make the number more relevant than a generic leaderboard screenshot, because the benchmark is explicitly trying to mirror the kinds of underspecified, multi-step tasks developers give coding agents every day. For product users, that is often the right question: not which model wins in the abstract, but which one gets more real work over the line inside the tool they already use.
The cursor_ai account usually mixes release notes, agent features, and evaluation methodology, and this post follows that pattern closely. What to watch next is whether independent usage reports match the 72.8% claim, whether GPT-5.5 keeps its lead as other coding agents update, and whether the economics still make sense after the discount ends on May 2. The primary sources are the tweet, Cursor’s forum post, and the CursorBench methodology note.
Related Articles
OpenAI is pushing harder into agentic work, not just chat. On the company's own evals, GPT-5.5 reaches 82.7% on Terminal-Bench 2.0, beats GPT-5.4 by 7.6 points, and uses fewer tokens in Codex.
Cursor said on March 26, 2026 that real-time reinforcement learning lets it ship improved Composer 2 checkpoints every five hours. Cursor’s March 27 technical report says the model combines continued pretraining on Kimi K2.5 with large-scale RL in realistic Cursor sessions, scores 61.3 on CursorBench, and runs on an asynchronous multi-region RL stack with large sandbox fleets.
LocalLLaMA cared about this eval post because it mixed leaderboard data with lived coding-agent pain: Opus 4.7 scored well, but the author says it felt worse in real use.