Why it matters: public coding benchmarks are getting less useful at the frontier, so a fresh product-side score can move developer attention fast. Cursor says GPT-5.5 is now its top model on CursorBench at 72.8% and is discounting usage by 50% through May 2.
HN did not greet GPT-5.5 with applause first. The thread went straight to pricing, context tiers, and whether the model actually behaves better once real coding work starts.
Why it matters: API availability is the moment a flagship model becomes something teams can actually wire into products. OpenAI’s developer account says GPT-5.5 brings fewer retries, and the official release page now lists API access with a 1M context window and updated pricing.
OpenAI is pushing harder into agentic work, not just chat. On the company's own evals, GPT-5.5 reaches 82.7% on Terminal-Bench 2.0, beats GPT-5.4 by 7.6 points, and uses fewer tokens in Codex.
OpenAI is pitching GPT-5.5 as more than a routine model refresh. With 82.7% on Terminal-Bench 2.0, 58.6% on SWE-Bench Pro, and a claim that it keeps GPT-5.4-level latency, the company is resetting expectations for long-running coding agents.
OpenAI is moving from generic chat to a healthcare-specific workspace, and the timing is clear: 72% of physicians now report AI use in clinical practice. The new product is free to verified U.S. physicians, NPs, PAs, and pharmacists, and OpenAI says doctors rated 99.6% of tested responses safe and accurate across 6,924 conversations.
Privacy tooling usually breaks at scale or forces raw text onto a server. OpenAI’s 1.5B open-weight Privacy Filter runs locally, handles 128,000-token inputs, and posts 97.43% F1 on a corrected PII-Masking-300k benchmark.
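The architectural pattern behind a local filter is simple: detect and mask sensitive spans on the user's machine, send only placeholders upstream, and restore the values locally when the response comes back. A minimal sketch of that mask-before-send flow follows; the regex detector here is a toy stand-in for the actual 1.5B model, and every name in it (`mask_pii`, `unmask`, the placeholder format) is illustrative rather than anything OpenAI ships.

```python
import re

# Toy stand-in for the local Privacy Filter model. The real filter is a
# 1.5B-parameter model; these regexes only illustrate the pattern of
# masking before text leaves the machine, not the model's capabilities.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

def mask_pii(text: str) -> tuple[str, dict[str, str]]:
    """Replace detected PII with numbered placeholders; return the masked
    text plus a local-only mapping for restoring the original values."""
    mapping: dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        for i, match in enumerate(pattern.findall(text)):
            placeholder = f"[{label}_{i}]"
            mapping[placeholder] = match
            text = text.replace(match, placeholder)
    return text, mapping

def unmask(text: str, mapping: dict[str, str]) -> str:
    """Restore original values after the remote response returns."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text
```

The key property is that the mapping from placeholders back to raw values never leaves the machine; only the masked text crosses the network.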
HN treated GPT-5.5 less like another model launch and more like a test of whether AI can actually carry messy computer tasks to completion. The discussion kept drifting from benchmarks to rollout timing, API access, and whether the gains show up in real coding work.
OpenAI is attaching cash to the hardest kind of safety failure: a single prompt that breaks all five of its bio safeguards. The new GPT-5.5 Bio Bug Bounty pays $25,000 for a universal jailbreak, limits testing to GPT-5.5 in Codex Desktop, and starts formal testing on April 28.
This is a distribution story, not just a usage milestone. OpenAI says Codex grew from more than 3 million weekly developers in early April to more than 4 million two weeks later, and it is pairing that demand with Codex Labs plus seven global systems integrators to turn pilots into production rollouts.
The bottleneck moved from GPUs to the API layer, and OpenAI changed the transport to keep up. By adding WebSocket mode and connection-scoped caching to the Responses API, the company says agentic workflows improved by up to 40% end-to-end and GPT-5.3-Codex-Spark reached 1,000 tokens per second with bursts up to 4,000.
The important shift is architectural: teams can mask sensitive text before it ever leaves the machine. OpenAI’s 1.5B-parameter Privacy Filter supports 128,000 tokens and scored 97.43% F1 on a corrected version of the PII-Masking-300k benchmark.