Harvard Study in Science: OpenAI's o1 Outperforms ER Physicians on Diagnostic Accuracy
Study Overview
A peer-reviewed study from Harvard Medical School and Beth Israel Deaconess Medical Center, published in Science, found that OpenAI's o1 model outperformed two attending physicians in diagnosing real emergency room cases.
Key Numbers
- 76 real ER triage cases evaluated
- OpenAI o1 exact or near-exact diagnoses: 67%
- Two internal medicine physicians on the same measure: 55% and 50%
- On 5 detailed clinical case studies: o1 scored 89%, versus 34% for 46 doctors using conventional search tools
Methodology
Both the model and the physicians received the same unprocessed EHR data as text. Neither was given additional images or lab data, mirroring the information actually available to clinicians.
Significance and Caveats
Researchers emphasized augmentation over replacement, framing AI as a second-opinion tool for time-pressured ER clinicians. The 76-case sample is too small for regulatory approval, and further studies covering rare diseases and complex comorbidities are needed before clinical deployment.
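To give a rough sense of why a 76-case sample leaves wide uncertainty, the sketch below computes approximate 95% Wilson score intervals for the three reported accuracy figures. This is an illustration only, not the study's own analysis: it assumes the rounded percentages apply to all 76 cases, treats each rate as an independent proportion, and uses the hypothetical labels "physician A" and "physician B" for the two doctors.

```python
import math

def wilson_interval(p_hat, n, z=1.96):
    """Approximate 95% Wilson score interval for an observed proportion."""
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

N_CASES = 76  # ER triage cases reported in the article

# Reported accuracy figures; the physician labels are hypothetical.
reported = {"o1": 0.67, "physician A": 0.55, "physician B": 0.50}

intervals = {name: wilson_interval(p, N_CASES) for name, p in reported.items()}
for name, (lo, hi) in intervals.items():
    print(f"{name}: {lo:.2f} to {hi:.2f}")
```

Under these assumptions each interval spans roughly 20 percentage points, and the o1 range overlaps the physicians' ranges. Because the model and the physicians saw the same cases, a paired comparison would be the more appropriate test, so the overlap should be read only as a sign of how much uncertainty a sample of this size carries.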
Source: TechCrunch