agents · Q5 · to verify
Cito & Bork 2025 — the 'polluted well' / code-collapse argument for software ecosystems (arXiv)
LLM-generated code, often containing subtle bugs or stylistic biases, is being committed to public repositories and then used as training data for the next generation of code models — creating a recursive loop that, over time, narrows code diversity, loses optimized 'tail' solutions, and converges open-source ecosystems on bland, vulnerable patterns. The authors warn that 'replacing the human engineer caps the intelligence of the software ecosystem at the level of the current model... turn[ing] engineering into a closed loop.'
Trajectory of code-corpus diversity (entropy of idioms, tail solution frequency, novelty rate) under iterative LLM generation → public-repo commit → next-generation training
Qualitative trajectory: narrowing variance, tail loss, path dependence. This is the same shape as the Shumailov et al. model-collapse trajectory, but in a code substrate. Specific numerical metrics from this paper have not yet been extracted for verification.
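As a rough illustration of the loop described above (a toy sketch, not the model from Cito & Bork), the following Python snippet assumes each "generation" of code is simply resampled from the idiom frequencies of the previous corpus. The function names, the parameters (`num_idioms`, `corpus_size`, `generations`), and the Zipf-like starting distribution are all hypothetical choices made for illustration. It tracks two of the diversity measures named above: Shannon entropy of idioms and the number of distinct idioms that survive.

```python
import math
import random
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy (in bits) of an idiom-frequency table."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def simulate(num_idioms=200, corpus_size=2000, generations=10, seed=0):
    """Toy recursive-training loop (an assumption, not the paper's model).

    Each generation samples a corpus from the current idiom weights, then
    uses the sampled frequencies as the weights for the next generation.
    Returns a list of (entropy_bits, distinct_idioms) per generation.
    """
    rng = random.Random(seed)
    idioms = list(range(num_idioms))
    # Assumed Zipf-like starting point: a few common idioms, a long rare tail.
    weights = [1.0 / (rank + 1) for rank in idioms]
    history = []
    for _ in range(generations):
        corpus = rng.choices(idioms, weights=weights, k=corpus_size)
        counts = Counter(corpus)
        history.append((shannon_entropy(counts), len(counts)))
        # The next "model" only knows what the previous corpus contained,
        # so an idiom that drops to zero can never reappear.
        weights = [counts.get(i, 0) for i in idioms]
    return history


if __name__ == "__main__":
    for gen, (entropy, distinct) in enumerate(simulate()):
        print(f"generation {gen}: entropy={entropy:.3f} bits, distinct idioms={distinct}")
```

In this sketch the count of distinct idioms can only stay flat or fall, because a lost tail idiom has zero weight in every later generation. That is the tail-loss and path-dependence shape in its simplest form; it says nothing about the rates or metrics the paper itself may report.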
- Sample
- Analytical / model-based argument; the AHI review describes it as a 'theoretical model' rather than reporting an empirical sample size (N). Whether the paper reports any empirical N remains to be verified.
- Methodology
- Theoretical / model-based analysis of the recursive-training dynamic specific to software ecosystems where AI outputs persist as training data through public-repository commits.
What this means
- Code-collapse is the software-ecosystem analog of Shumailov et al.'s model collapse: the same mechanism (recursive training on a self-constructed niche, with loss of the distribution's tails), applied to the substrate of public source code.
- Implies a governance gap: existing open-source norms (in the style of Linus's Law, 'given enough eyeballs, all bugs are shallow') were calibrated for a substrate of human contributions, not for one where the contribution pipeline is mediated by LLMs.
- Pairs with the institutional-economics finding that AI shifts the locus of cost from production to governance — the polluted-well case is the specific shape governance must now cover.
Source
arXiv · Cito & Bork · 2025 · peer-reviewed
Context
- What came before
- GitHub Copilot adoption studies (Song et al. 2024: +5.9% OSS contributions; Microsoft Research Copilot productivity work) reported first-order productivity wins without measuring the substrate-level recursion. The code-collapse argument is the second-order critique.
- What comes next
- Verify whether the Cito & Bork paper reports empirical metrics or is a purely theoretical-model contribution. Look for empirical replication or partial replication in the OSS-telemetry literature. Connect to the METR 2025 finding that experienced developers working on familiar repos are slower with AI tools, which is possibly a leading indicator of substrate-quality degradation.
- Where this lands
- Encyclopedia Part I §1.3 (methodology gap), Part IV (product/operations — agentic coding), Part V (research frontier).