peopleanalyst

Research substrate

Insight Cards

Atomic quantitative findings from the research underlying the magazine and the AI Human Interaction Guide. Each card carries a single headline finding, full source attribution, methodology, and framing claims. Cards cite into longer editorial work by ID.

gen-ai · Q5 · to verify

Kazemitabaar et al. — 10-session AI-coding study with one-week retention (no short-term skill decrement)

A repeated-measures study of student programmers across 10 sessions with AI assistance, including a one-week retention check, found no statistically significant short-term decrement in manual code-modification ability or one-week retention compared to baseline — directly cutting against the strongest 'immediate AI deskilling' alarms while leaving long-run effects unmeasured.

Manual code-modification accuracy + one-week retention test, comparing AI-assisted-learning condition vs baseline
No statistically significant short-term decrement in manual code modification or one-week retention. Specific effect-size numbers not extracted to verification.
Sample
Student programmers across 10 instructional sessions; exact N not extracted to verification.
Methodology
Repeated-measures within-subject design across 10 sessions + retention probe one week post-intervention. Among the cleanest short-repeated-measures designs in the AI-coding literature per the AHI review.

What this means

  • Important null/negative result that constrains the 'AI immediately deskills' narrative — short-term substitution + reduced frustration do not measurably erode one-week retention.
  • Highlights the *measurement gap* rather than settling the deskilling question: 10 sessions + one-week retention is short by panel-study standards; the long-run trajectory remains untested.
  • Pairs with Bassner et al. (better scores but same learning), Stray et al. (no Copilot effect on commit activity), and a 3-year classroom study (stable grades despite prompt-behavior shift) as the 'null cluster' against which deskilling claims must be evaluated.

Source

(Title to verify — 10-session AI-coding learning study with retention probe)

arXiv preprint (referenced as a load-bearing student-repeated-measures design in AHI longitudinal-cognitive-effects review) · Majeed Kazemitabaar et al. · 2023 · peer-reviewed

Context

What came before
Public discourse on AI coding tools (2023-2024) often framed deskilling as an imminent, well-evidenced risk. The Kazemitabaar null is one of the cleanest data points cutting against that framing.
What comes next
Verify exact N, exact retention-test instrument, and whether retention was tested at intervals longer than one week. Connect to the METR 2025 finding (experienced devs slower on familiar repos with AI); together the two results triangulate the 'effects depend on expertise + horizon' picture.
Where this lands
Encyclopedia Part I §1.3 (methodology gap) — used to honestly bound the deskilling claim; Part V (research frontier — what we don't yet know).
gen-ai · Q6 · to verify

Lee et al. 2025 (Microsoft Research) — GenAI confidence inversely predicts critical thinking effort in knowledge work

In a survey of 319 knowledge workers describing 936 GenAI-assisted work tasks, higher self-reported confidence in the GenAI tool predicted less critical thinking effort, while higher self-reported self-confidence predicted more critical thinking. Qualitatively, GenAI reallocated critical effort away from direct task execution and toward verification, response integration, and stewardship of machine output.

Self-reported critical thinking effort regressed on (a) confidence in GenAI tool, (b) self-confidence, across knowledge-worker tasks
Higher confidence-in-GenAI → less critical thinking; higher self-confidence → more critical thinking. Exact regression coefficients / effect sizes not extracted to verification.
Sample
N = 319 knowledge workers describing 936 GenAI-assisted tasks
Methodology
Survey + qualitative coding of free-text task descriptions; mixed-methods analysis of the confidence-vs-effort relationship.

What this means

  • The strongest non-programming empirical anchor for the 'cognitive redistribution, not deskilling' synthesis: AI does not remove cognitive effort, it redirects it toward verification + integration + stewardship.
  • Maps directly onto the programming-specific findings (Prather et al. — illusion of competence; Shihab et al. — brownfield shift to prompt-view-implement; Qiao et al. — performance improvement without comprehension gain).
  • The confidence-direction effect (trust in tool reduces own effort; trust in self increases it) is a measurable calibration variable that any 6-24 month panel study must instrument.

Source

The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects from a Survey of Knowledge Workers

Microsoft Research (working paper) · Hao-Ping (Hank) Lee et al. (Microsoft Research and collaborators) · 2025-01 · peer-reviewed

Context

What came before
Pre-2025 GenAI productivity literature focused on completion-time deltas and self-reported satisfaction; explicit measurement of cognitive-effort redistribution was rare.
What comes next
Verify exact regression coefficients in the primary source. Extend to the AHI Part V research-frontier discussion of calibration failure modes. Pair with the programming-specific Prather / Shihab / Qiao findings.
Where this lands
Encyclopedia Part I §1.3 (methodology gap — cognitive redistribution); Part II (workforce — how AI changes knowledge work); Part V (research frontier — calibration failure modes).
gen-ai · Q7 · to verify

Shumailov et al. 2024 — model collapse from recursively generated data (Nature)

Generative AI models trained on data that includes their own previous outputs progressively forget the true data distribution over generations — in particular, low-probability ('tail') events disappear first, and after enough iterations the model converges on a degenerate distribution with little resemblance to the original.

Distribution distance from original training corpus across model generations under recursive self-training (perplexity drift; loss of distributional tails)
Tails of the data distribution are lost within a handful of generations; convergence to a degenerate distribution is theoretically inevitable in the recursive-self-training regime. Specific numerical values for perplexity drift were not extracted to verification; see provenance.
Sample
Simulation across multiple model families (Gaussian mixture models, variational autoencoders, large language models) with iterative self-training cycles. Specific N of iterations / models not extracted to verification.
Methodology
Theoretical analysis plus empirical demonstration of recursive-training degeneration across multiple model families; trained successive generations of models on data sampled from prior model generations and measured distributional drift.
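The recursive-training loop described above can be illustrated with a toy sketch. This is not the paper's experimental setup: it assumes a single one-dimensional Gaussian standing in for the "model", a maximum-likelihood refit standing in for "training", and illustrative values for the sample size and generation count. The function name `simulate_recursive_fitting` and its parameters are hypothetical. The fitted standard deviation serves as a crude proxy for how much of the original spread, tails included, survives each generation.

```python
import random
import statistics


def simulate_recursive_fitting(n_samples=20, n_generations=500, seed=0):
    """Refit a Gaussian to its own samples, generation after generation.

    Returns a list of (generation, sigma) tuples, where sigma is the standard
    deviation of the model that generated that generation's data.
    Toy illustration only; all parameter values are assumptions.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 is the "true" distribution
    history = []
    for generation in range(n_generations + 1):
        history.append((generation, sigma))
        # Data for the next generation comes only from the current model.
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # "Training": maximum-likelihood Gaussian fit to the sampled data.
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
    return history


if __name__ == "__main__":
    # Print every 100th generation to show the spread shrinking over time.
    for generation, sigma in simulate_recursive_fitting()[::100]:
        print(f"generation {generation:3d}  fitted sigma = {sigma:.3e}")
```

With a small sample per generation, each refit tends to slightly underestimate the spread, and those errors compound, so the fitted sigma drifts toward zero. Larger `n_samples` slows the drift in this toy setting but does not remove it.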

What this means

  • The 'model collapse' phenomenon is the digital-ecological analog of niche-construction-induced variance collapse: the AI's outputs become its own training environment, and the loop systematically erodes diversity.
  • Implies that uncontrolled use of LLM-generated web content as future training data creates a feedback loop that can erode, rather than extend, the capability of future models.
  • Provides a load-bearing mechanism for the encyclopedia's Part I §1.3 'methodology gap' — software engineering and knowledge work that uses AI outputs without provenance discipline is a model-collapse-like substrate for the human-AI system.

Source

AI models collapse when trained on recursively generated data

Nature · Ilia Shumailov et al. · 2024-07-24 · peer-reviewed

Context

What came before
Pre-2024 LLM training discourse treated web-scale text as an essentially infinite, externally-sourced training substrate. The implicit assumption was that successive model generations could continue scaling on more of the same kind of data.
What comes next
Verification of the specific numerical drift rates (iterations to tail loss, perplexity-curve shapes). Comparison with Cito & Bork 2025 'code collapse' analogue for software ecosystems. Empirical work on whether commercial provider data-filtering pipelines (e.g., anti-AI-detection in training data curation) actually prevent the collapse trajectory.
Where this lands
Encyclopedia Part I §1.3 (methodology gap / why this isn't software-as-usual) and Part V (research frontier — feedback-loop measurement).