Sciences Feb 16, 2026
OpenAI says it ran more than one million synthetic evaluations across papers from 160+ political science journals to prioritize replication efforts. The workflow uses model-vs-observed disagreement to identify studies most worth re-testing.
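To make the idea of ranking by model-vs-observed disagreement concrete, here is a minimal sketch. It is not OpenAI's actual pipeline, which the summary does not describe. It assumes each study has a reported effect and a model-generated estimate on the same scale, and it scores disagreement as the absolute gap between the two. The `Study` class, its field names, the `rank_for_replication` helper, and the sample values are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Study:
    # All fields are hypothetical; the source does not specify a data schema.
    study_id: str
    observed_effect: float   # effect size reported in the paper
    predicted_effect: float  # model's synthetic estimate (assumed comparable)


def disagreement(study: Study) -> float:
    """Absolute gap between the model's estimate and the reported effect."""
    return abs(study.predicted_effect - study.observed_effect)


def rank_for_replication(studies: list[Study], top_k: int) -> list[Study]:
    """Return the top_k studies with the largest model-vs-observed gap."""
    return sorted(studies, key=disagreement, reverse=True)[:top_k]


# Made-up example values, not results from any real study.
studies = [
    Study("A", observed_effect=0.40, predicted_effect=0.05),
    Study("B", observed_effect=0.10, predicted_effect=0.12),
    Study("C", observed_effect=-0.20, predicted_effect=0.30),
]

shortlist = rank_for_replication(studies, top_k=2)
print([s.study_id for s in shortlist])  # → ['C', 'A']
```

A real workflow would likely need a more careful score than a raw absolute difference, for example one that accounts for uncertainty in both the reported and the predicted effect, but the ranking step would look broadly similar.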