Kazemitabaar et al. — 10-session AI-coding study with one-week retention (no short-term skill decrement)
A repeated-measures study of student programmers across 10 sessions with AI assistance, including a one-week retention check, found no statistically significant short-term decrement in manual code-modification ability or one-week retention compared to baseline — directly cutting against the strongest 'immediate AI deskilling' alarms while leaving long-run effects unmeasured.
- Sample
- Student programmers across 10 instructional sessions; exact N not extracted to verification.
- Methodology
- Repeated-measures within-subject design across 10 sessions + retention probe one week post-intervention. Among the cleanest short-repeated-measures designs in the AI-coding literature per the AHI review.
What this means
- Important null/negative result that constrains the 'AI immediately deskills' narrative — short-term substitution + reduced frustration do not measurably erode one-week retention.
- Highlights the *measurement gap* rather than settling the deskilling question: 10 sessions + one-week retention is short by panel-study standards; the long-run trajectory remains untested.
- Pairs with Bassner et al. (better scores but same learning), Stray et al. (no Copilot effect on commit activity), and 3-year classroom study (stable grades despite prompt-behavior shift) as the 'null cluster' against which deskilling claims must be evaluated.
Source
(Title to verify — 10-session AI-coding learning study with retention probe)
arXiv preprint (referenced as a load-bearing student-repeated-measures design in AHI longitudinal-cognitive-effects review) · Majeed Kazemitabaar & et al. · 2023 · peer-reviewed
Context
- What came before
- Public discourse on AI coding tools (2023-2024) often framed deskilling as an imminent, well-evidenced risk. The Kazemitabaar null is one of the cleanest data points cutting against that framing.
- What comes next
- Verify exact N, exact retention-test instrument, and whether retention was tested at intervals longer than one week. Connect to METR 2025 finding (experienced devs slower on familiar repos with AI) — together they triangulate the 'effects depend on expertise + horizon' picture.
- Where this lands
- Encyclopedia Part I §1.3 (methodology gap) — used to honestly bound the deskilling claim; Part V (research frontier — what we don't yet know).