ChatGPT health answers cut flagged factual issues by 71%

Health is one of the highest-stakes places people use ChatGPT, and OpenAI is now putting numbers on how much its health answers have changed. The company says more than 230 million people each week ask ChatGPT for help with health and wellness questions, from lab results to appointment preparation and insurance navigation.

The sharpest figure is a 71% decline. OpenAI says privacy-preserving monitors on recent production traffic found that the rate of health responses with at least one flagged factuality issue fell by 71% over the last two months. That matters because the claim is not limited to a lab demo; it is tied to health traffic measured across billions of messages a week.
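A 71% decline in a rate is most naturally read as a relative reduction, not a drop of 71 percentage points, though OpenAI's exact metric definition is not given here. The sketch below shows what that reading implies under stated assumptions: the baseline rates are hypothetical placeholders chosen for illustration, since the article reports no absolute before-and-after figures, and the function name is ours.

```python
def rate_after_relative_drop(baseline_rate: float, relative_drop: float) -> float:
    """Return the rate that remains after a relative (proportional) decline."""
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be a fraction between 0 and 1")
    if not 0.0 <= relative_drop <= 1.0:
        raise ValueError("relative_drop must be a fraction between 0 and 1")
    return baseline_rate * (1.0 - relative_drop)


# The only figure taken from the article: a 71% decline in the flagged rate.
RELATIVE_DROP = 0.71

# HYPOTHETICAL baselines, invented for illustration only.
# The article does not report the actual rate before or after the change.
HYPOTHETICAL_BASELINES = [0.10, 0.02]

for baseline in HYPOTHETICAL_BASELINES:
    after = rate_after_relative_drop(baseline, RELATIVE_DROP)
    print(f"baseline {baseline:.1%} -> after a 71% relative drop: {after:.2%}")
```

Whatever the starting point, a 71% relative decline leaves 29% of the original rate. How large that is in absolute terms depends on a baseline the article does not disclose.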

The model behind the update is GPT-5.5 Instant, which OpenAI says is available to free ChatGPT users subject to limits. In its health-specific evaluations, including HealthBench and HealthBench Professional, GPT-5.5 Instant reached performance comparable to the company’s latest frontier Thinking models and improved substantially over GPT-5.3 Instant.

OpenAI also used physician comparison studies. Doctors wrote responses to representative health conversations with unlimited time and internet access, while a separate physician panel reviewed 3,500 responses across accuracy, communication, completeness, instruction following, and usefulness for health decisions. OpenAI says GPT-5.5 Instant responses were rated higher than physician-written and older model responses across those criteria.

The practical story is not that ChatGPT becomes a doctor. The improvement OpenAI is describing is narrower and more important: better recognition of urgent-care signals, more relevant follow-up questions, clearer uncertainty, and fewer unsupported claims. For consumer health AI, the bar is shifting from fluent medical language to measurable reductions in risky confidence.
