#gpt-5-5

LLM X/Twitter May 28, 2026 2 min read

DeepSWE’s 113 tasks put GPT-5.5 at 70% and Claude Opus 4.7 at 54%

DeepSWE reframes coding-agent evaluation with 113 original tasks across 91 repositories. Its first leaderboard gives GPT-5.5 a 70.0% pass@1 score, versus 54.2% for Claude Opus 4.7.

#deepswe #coding-agents #benchmark

AI May 17, 2026 1 min read

OpenAI Opens GPT-5.5-Cyber to EU Defenders While Anthropic Holds Back on Mythos

OpenAI's new EU Cyber Action Plan grants vetted European cybersecurity teams access to GPT-5.5-Cyber, a defensive variant of its latest model tuned for vulnerability research and malware analysis. Anthropic continues to restrict its more powerful Mythos model over exploitation concerns.

#openai #cybersecurity #eu

LLM Apr 27, 2026 2 min read

GitHub Copilot gets GPT-5.5, but the 7.5x multiplier changes the math

GitHub is rolling GPT-5.5 into Copilot across IDEs, CLI, mobile, github.com, and the cloud agent, turning OpenAI's latest model into a daily coding option instead of a release-note headline. The catch is a 7.5x premium request multiplier, and Business or Enterprise admins must explicitly enable access.

#github #copilot #gpt-5-5

LLM Apr 26, 2026 2 min read

Cursor puts GPT-5.5 atop CursorBench at 72.8% and halves price

Why it matters: public coding benchmarks are getting less useful at the frontier, so a fresh product-side score can move developer attention fast. Cursor says GPT-5.5 is now its top model on CursorBench at 72.8% and is discounting usage by 50% through May 2.

#cursor #gpt-5-5 #benchmarks

LLM Hacker News Apr 26, 2026 2 min read

HN Meets GPT-5.5 API With a Price-and-Behavior Audit, Not a Victory Lap

HN did not greet GPT-5.5 with applause. The thread went straight to pricing, context tiers, and whether the model actually behaves better once real coding work starts.

#openai #gpt-5-5 #api

AI X/Twitter Apr 25, 2026 2 min read

GPT-5.5 reaches the API with fewer retries and higher efficiency

Why it matters: API availability is the moment a flagship model becomes something teams can actually wire into products. OpenAI’s developer account says GPT-5.5 brings fewer retries, and the official release page now lists API access with a 1M context window and updated pricing.

#openai #api #gpt-5-5

AI X/Twitter Apr 25, 2026 2 min read

GitHub Copilot rolls out GPT-5.5 for complex agentic coding

Why it matters: model launches become more consequential when they land in tools developers already use every day. GitHub says early testing found GPT-5.5 strongest on complex multi-step coding tasks, and the rollout ships with a 7.5x premium request multiplier.

#github #copilot #gpt-5-5

LLM X/Twitter Apr 25, 2026 2 min read

OpenAI puts GPT-5.5 live with an 82.7% Terminal-Bench 2.0 score

OpenAI is pushing harder into agentic work, not just chat. On the company's own evals, GPT-5.5 reaches 82.7% on Terminal-Bench 2.0, beats GPT-5.4 by 7.6 points, and uses fewer tokens in Codex.

#openai #gpt-5-5 #codex

LLM X/Twitter Apr 23, 2026 2 min read

GPT-5.5 jumps 3 points clear on Artificial Analysis, but cost rises 20%

Why it matters: this is one of the first external benchmark reads to land right after the GPT-5.5 launch. Artificial Analysis said GPT-5.5 moved 3 points clear on its Intelligence Index, while running the full index became roughly 20% more expensive.

#gpt-5-5 #artificial-analysis #benchmarks