Harvard Study: OpenAI's o1 Diagnoses 67% of ER Patients Correctly, Outperforming Doctors
Original: OpenAI's o1 correctly diagnosed 67% of ER patients vs. 50-55% by triage doctors
The Trial
Harvard Medical School researchers ran a clinical study at a Boston emergency room, pitting OpenAI's o1 reasoning model against human physicians on identical patient data. Each side received the same electronic health records — vital signs, demographics, and a brief triage note — and was asked to produce a diagnosis for 76 patients.
Key Numbers
- Basic triage data: AI 67% accuracy vs. doctors 50-55%
- Full clinical data: AI 82% vs. doctors 70-79%
- Long-term treatment planning (5 case studies): AI 89% vs. doctors 34%
The AI's edge was most pronounced in rapid-decision triage scenarios with minimal information. In one case, the doctors concluded that anticoagulants were failing in a patient with a blood clot, whereas the AI noted the patient's lupus history and correctly identified lung inflammation as the actual cause.
Triadic Care, Not Replacement
Lead author Arjun Manrai, who heads an AI lab at Harvard Medical School, was explicit: "I don't think our findings mean that AI replaces doctors." The study only evaluated text-based data; patient appearance and physical cues were not part of the assessment — making the AI's role closer to a second opinion on paperwork.
Co-author Dr Adam Rodman described LLMs as among the "most impactful technologies in decades" and predicted that within ten years, medicine will shift to a "triadic care model" — doctor, patient, and AI working together.
Adoption Is Underway
Nearly one in five US physicians already use AI for diagnostic assistance. In the UK, 16% of doctors use AI daily and 15% weekly, with clinical decision-making as a primary use case. Top concerns remain AI error and liability.