Harvard Study in Science: OpenAI's o1 Outperforms ER Physicians on Diagnostic Accuracy

Study Overview

A peer-reviewed study from Harvard Medical School and Beth Israel Deaconess Medical Center, published in Science, found that OpenAI's o1 model outperformed two attending physicians in diagnosing real emergency room cases.

Key Numbers

76 real ER triage cases evaluated
OpenAI o1 exact or near-exact diagnoses: 67%
Two internal medicine physicians: 55% and 50%
On 5 detailed clinical case studies: o1 scored 89%, versus 34% for a group of 46 doctors using conventional search tools

Methodology

Both the model and the physicians received the same unprocessed electronic health record (EHR) data as text. No additional images or lab data were provided, mirroring the information actually available to clinicians at that point in care.

Significance and Caveats

Researchers emphasized augmentation over replacement, positioning AI as a second-opinion tool for time-pressured ER clinicians. The 76-case sample is too small to support regulatory approval, and further studies covering rare diseases and complex comorbidities are needed before clinical deployment.
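To give a rough sense of why 76 cases is considered small, the sketch below computes 95% Wilson score intervals for the three reported accuracy figures. It is illustrative only and not part of the study's analysis: the correct-case counts are back-calculated from the rounded percentages (an assumption), and the labels `physician_a` and `physician_b` are hypothetical names for the two physicians.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

N_CASES = 76  # number of ER triage cases reported in the article

# Reported accuracy rates; the labels are hypothetical, not from the study.
reported = {"o1": 0.67, "physician_a": 0.55, "physician_b": 0.50}

intervals = {}
for name, rate in reported.items():
    # Assumption: the count of correct cases is inferred from the rounded percentage.
    k = round(rate * N_CASES)
    intervals[name] = wilson_interval(k, N_CASES)
    lo, hi = intervals[name]
    print(f"{name}: {k}/{N_CASES} correct, 95% CI [{lo:.2f}, {hi:.2f}]")
```

Under these assumptions each interval spans roughly ten percentage points on either side of the point estimate, and the interval for o1 overlaps those of both physicians. Because the model and the physicians were scored on the same cases, a paired comparison would be the more appropriate test; this calculation only shows how much uncertainty a sample of this size carries.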

Source: TechCrunch
