METR follow-up: from “20% slowdown” to possible AI speedup for expert developers

Original: METR follows up on its often-cited study from last year, which found a 20% developer slowdown in a specific experiment. The new data suggest a speedup is now likely, alongside other notable findings.

LLM · Feb 25, 2026 · By Insights AI (Reddit) · 2 min read

An r/singularity thread is circulating METR's latest note, an uplift update on how AI tools affect software productivity. The topic matters because METR's earlier study became a common reference for the claim that AI assistance could slow down experienced developers in realistic open-source work.

What METR now reports

METR reiterates the original finding: in data from February to June 2025, experienced open-source developers were around 20% slower on study tasks when AI tools were allowed. In the follow-up, the team says that late-2025/early-2026 conditions look different. Broader adoption of agentic tools such as Claude Code and Codex appears to have changed developer behavior and task selection.

The update includes raw estimates that move toward speedup: for returning developers, METR reports an estimated -18% effect (confidence interval -38% to +9%); for newly recruited developers, an estimated -4% effect (confidence interval -15% to +9%). In their framing, negative values indicate speedup. That shift is directionally important, but METR repeatedly warns that this signal is noisy.
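A quick way to read these numbers is to check whether each confidence interval excludes zero. Below is a minimal sketch using the estimates METR reports; the helper function is illustrative, not METR's own analysis code.

```python
# Interpreting effect estimates where negative values indicate speedup
# (METR's framing). The classifier below is a hypothetical helper.

def summarize(group, point, ci_low, ci_high):
    """Describe a point estimate and whether its CI excludes zero."""
    direction = "speedup" if point < 0 else "slowdown"
    conclusive = ci_high < 0 or ci_low > 0  # does the CI exclude zero?
    verdict = "CI excludes zero" if conclusive else "CI spans zero, inconclusive"
    return f"{group}: {abs(point)}% {direction} (CI {ci_low}% to {ci_high}%), {verdict}"

for row in [("returning developers", -18, -38, 9),
            ("new developers", -4, -15, 9)]:
    print(summarize(*row))
```

Both intervals span zero, which is exactly why METR describes the shift as directionally important but noisy.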

Why uncertainty remains high

  • Selection effects increased: surveyed developers reported skipping tasks they did not want to do without AI (30% to 50%).
  • Recruitment incentives changed: pay dropped from $150/hour in the original study to $50/hour in the follow-up.
  • Measurement became harder: multi-agent workflows made time tracking less reliable.

METR’s interpretation is nuanced: productivity uplift is now likely higher than in early 2025, but current experimental design may systematically miss the tasks and developers with the biggest AI gains. So this is not a clean “AI is definitely X% faster” conclusion. It is a methodological warning plus a directional update.

For engineering teams, the practical takeaway is to instrument productivity locally instead of importing a single external headline number. Segment by task type, tool stack, review quality, and completion criteria. The Reddit discussion is useful because it surfaces that measurement design is now as important as model capability when evaluating AI coding impact.
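One lightweight way to start that local instrumentation is to log task completion times tagged by segment and compare AI-assisted against unassisted work. The sketch below uses hypothetical field names and made-up data; it is a starting point under those assumptions, not a validated methodology.

```python
# Hedged sketch: per-segment comparison of AI-assisted vs. unassisted
# task times from a local log. Records and field names are hypothetical.
from collections import defaultdict
from statistics import median

records = [
    # (task_type, used_ai, minutes_to_complete)
    ("bugfix", True, 40), ("bugfix", False, 55), ("bugfix", True, 35),
    ("bugfix", False, 60), ("feature", True, 120), ("feature", False, 110),
    ("feature", True, 130), ("feature", False, 100),
]

by_segment = defaultdict(lambda: {True: [], False: []})
for task_type, used_ai, minutes in records:
    by_segment[task_type][used_ai].append(minutes)

for task_type, groups in sorted(by_segment.items()):
    ai, no_ai = median(groups[True]), median(groups[False])
    change = (ai - no_ai) / no_ai * 100  # negative = faster with AI
    print(f"{task_type}: {change:+.0f}% median time change with AI")
```

Even this toy data shows why segmentation matters: the same tool stack can speed up one task type while slowing down another, which a single headline number would hide.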

© 2026 Insights. All rights reserved.