Arena puts GPT-5.5 at #2 in search and +50 in Code Arena
Arena.ai’s April 27 X post is one of the first broad external scorecards for GPT-5.5 after OpenAI shipped the model on April 23. That matters because launch threads usually tell you what a lab wants the model to be known for. Community evaluation tells you where it actually lands when people compare it against rivals across different tasks.
“Code Arena: #9, a strong +50pt jump over GPT-5.4 … Search Arena: #2 … Expert Arena: #5.”
The Arena account, formerly LMArena, regularly posts community-driven benchmark updates across text, search, vision, and coding. This thread is valuable precisely because it is not built around a single vanity metric. The breakdown is mixed but informative: GPT-5.5 ranks #6 in Document Arena, #7 in Text Arena, #3 in Math, #8 in Instruction Following, #5 in Vision, and #2 in Search. That profile suggests a model that improved broadly, not one that simply swept every leaderboard on arrival.
The coding result is the easiest number to misread. A #9 rank does not sound impressive on its own, but the thread says GPT-5.5 gained 50 points over GPT-5.4 in Code Arena, which measures agentic web-development tasks. In other words, the model appears meaningfully stronger than its predecessor even though it still trails the very top tier. The same thread also points to a #5 finish in Expert Arena, which matters more to users with hard professional prompts than to those judging casual chatbot feel.
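For a sense of scale on that +50: Arena leaderboards are fit with a Bradley-Terry model and reported on an Elo-style scale. Assuming the familiar Elo convention (base 10, divisor 400), a rating gap converts to an expected head-to-head win rate. The sketch below is back-of-the-envelope intuition under that assumption, not Arena's published methodology.

```python
# Illustrative only: converts an Elo-style rating gap into an expected
# pairwise win rate, assuming the common base-10 / 400-point convention.

def expected_win_rate(rating_gap: float) -> float:
    """Probability the higher-rated model wins a head-to-head vote."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))

if __name__ == "__main__":
    gap = 50.0  # GPT-5.5 vs GPT-5.4 in Code Arena, per the thread
    print(f"+{gap:.0f} points -> {expected_win_rate(gap):.1%} expected win rate")
    # Prints roughly 57.1%: a real but modest head-to-head edge,
    # consistent with "meaningfully stronger, still not top tier."
```

Read that way, a 50-point jump means GPT-5.5 would be preferred over GPT-5.4 in roughly 57% of pairwise coding votes, which squares with the thread's framing of clear but not sweeping improvement.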
What to watch next is whether these placements hold once more samples arrive and whether higher-reasoning configurations move the coding rank upward. The current takeaway is not “GPT-5.5 won everything.” It is that OpenAI’s new model looks more balanced than the launch hype alone could prove, with especially clear movement in coding and search.