peopleanalyst


agents · Q6 · to verify

METR 2025 — experienced open-source developers on familiar large repos are slower with AI coding tools than without

In a 2025 study by METR, experienced open-source developers working on large repositories they knew intimately were measurably slower at completing tasks with AI coding tools than without them. The result directly inverts the canonical 'AI makes developers faster' assumption in the high-expertise, high-context-specificity regime.

Task-completion time with-AI vs without-AI for experienced developers on familiar large open-source repositories
Experienced developers were *slower* with AI tools (sign reversed from the controlled-task benchmark). Exact magnitude not yet extracted; to be verified.
Sample
Experienced developer cohort; exact N not yet extracted; to be verified.
Methodology
Within-subject or treatment/control study of experienced developers on large familiar repositories, with/without AI coding tools.

What this means

  • This is the single result that most cleanly inverts the Peng et al. 55.8% benchmark. It shows that the 'AI helps' generalization breaks down in the high-expertise, high-context-specificity regime that describes most production engineering work.
  • Maps directly onto the AHI institutional-economics reading: when asset specificity (here, repo-specific tacit knowledge) is high, AI generation does not compose well with the verification + integration work that production code requires.
  • Critical counter-evidence for the encyclopedia's Part I §1.3 honesty register — without this, the methodology-gap argument leans too heavily on the controlled-task literature.

Source

(2025 study; exact title and URL to verify — referenced in AHI institutional-economics topic review)

METR (Model Evaluation & Threat Research) · METR research team · 2025 · peer-review status to verify

Context

What came before
The 55.8% Copilot speedup (Peng et al. 2023) and the +14% NBER customer-support gain (Brynjolfsson et al. 2023) had established an 'AI substantially raises productivity' narrative. METR 2025 directly inverts the sign for the experienced-developer-on-familiar-repo case.
What comes next
Verify METR's exact title, URL, N, and effect-size estimate (this is the AHI review citation [19], but the precise publication is not given in the review's bibliography). Connect to the Stray two-year Copilot null and the AHI longitudinal-cognitive-effects review's 'measurement instrument matters' synthesis.
Where this lands
Encyclopedia Part I §1.3 (methodology gap), Part IV (product/operations — agentic coding limitations), Part V (research frontier — sign-inversion findings).