peopleanalyst

AI–Human Interaction — overview

What AI does to human capability over time. Penwright is the lead empirical apparatus; the broader frame extends to AI as long-term cognitive partner across professions and domains.

A research program on AI–human interaction at long timescales, with larger units of analysis, and in greater theoretical depth: the three places where the existing literature is thinnest. The program treats Penwright (an authorship-development system shipped inside Vela) as the lead apparatus for empirical work on AI-augmented writing, while keeping the broader frame general: AI as a long-term cognitive partner across professions, domains, and life stages.

Working draft · v0.1 · 2026-05-03 · For posting at peopleanalyst.com/research/ai-human-interaction


The question

Existing human–AI interaction (HAI) research clusters in single-session, individual-level, descriptive studies. We know remarkably little about what happens to a person's reasoning, vocabulary, social life, or skill acquisition over months and years of daily interaction with capable AI systems. We know less still about multi-party configurations (groups of humans and multiple agents), about long-context emergence (drift, sycophancy spirals, evolving rapport), and about cross-cultural and sociotechnical embedding.

This program takes up a specific question within that gap: what does AI do to human capability over time, and what kinds of system design support development rather than dependence?

Why now

Three things have changed simultaneously, and the intersection is under-studied:

  1. Capable conversational AI is now a daily-use technology for tens of millions of people doing knowledge work — writing, programming, research, design. The single-session lab study is no longer the relevant unit of analysis.
  2. The "operator-of-multiple-agents" is a new role without a corresponding body of human-factors literature. Cockpit HCI studied a pilot supervising one or two automation systems; this is one human supervising N heterogeneous agents with overlapping authority.
  3. The economic argument for AI tools assumes coordination cost and capability erosion are small. If the Ironies of Automation (Bainbridge 1983) generalize, with operator vigilance falling as system reliability rises, then the productivity story being told today systematically overstates the net effect.

The portable contribution

We are not building a benchmark, evaluating any specific model, or advocating any specific tool. The contribution is methodological and empirical:

  • A taxonomy of the existing HAI field — twelve overlapping branches, with edge-zones where the frontier sits (longitudinal interaction, multi-party configurations, long-context emergence, calibration of personalization, cross-cultural variation, sociotechnical embedding, failure-and-recovery dynamics).
  • A theoretical bridge between the HAI literature and the bodies of theory it has under-engaged: companion-species studies, cognitive apprenticeship, translation theory, working-alliance research, distributed cognition, niche construction, communication accommodation, improv pedagogy, institutional economics, phenomenology of skill, transactive memory, religious and ritual studies — among others.
  • A measurement framework for AI-augmented skill development — the Penwright Measurement Framework (six skill dimensions, six derived indices, three measurement layers, five-step learning loop) — with explicit failure modes the framework is engineered to avoid.
  • A pre-registered empirical program — the Penwright Research Program — twelve papers across three tiers (foundational theory · measurement and mechanism · longitudinal empirical studies), drawing from a shared dataset generated by the Penwright system in production.

The methods generalize beyond writing. The same shape — longitudinal capability tracking, structured interaction protocols, transparent corpus control, learning-loop scaffolding — applies to AI in coding, design, research, education, clinical practice, and any domain where the question is "is the human getting better, or is the system substituting for capability that's quietly atrophying?"
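As a rough illustration of what "longitudinal capability tracking" could look like in any of these domains, the sketch below fits a separate trend line to a person's assisted and unassisted task scores. It is not the Penwright Measurement Framework: the `Observation` record, its fields, and the `capability_trends` helper are hypothetical names introduced here, and a plain least-squares slope stands in for whatever indices the program actually pre-registers.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Observation:
    """One scored task. All fields are hypothetical, for illustration only."""
    week: int        # weeks since the person started using the tool
    score: float     # rubric score for the task (assumed scale)
    assisted: bool   # True if the AI was available during the task


def slope(points):
    """Ordinary least-squares slope of score against week."""
    xs = [p.week for p in points]
    ys = [p.score for p in points]
    if len(set(xs)) < 2:
        raise ValueError("need observations from at least two distinct weeks")
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def capability_trends(observations):
    """Return the per-week score trend for assisted and unassisted work."""
    assisted = [o for o in observations if o.assisted]
    unassisted = [o for o in observations if not o.assisted]
    return {"assisted": slope(assisted), "unassisted": slope(unassisted)}
```

A rising assisted trend alongside a flat or falling unassisted trend is the pattern the closing question points at. Under the program's own discipline, a slope like this would be a descriptive starting point reported with an effect size, not a conclusion on its own.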

The discipline

This program is structured to avoid the failure modes of vendor-funded "research":

  • Pre-registered predictions before data collection
  • Falsifiable, operationalized constructs — "skill development" decomposed into measurable indices before being invoked
  • Acknowledged researcher position — the principal investigator is also the operator of the system being studied; this is auto-ethnography for descriptive work and an explicit threat-to-validity for causal claims, mitigated through external operators where claims require generalization
  • Genre-aware analysis — memoir, nonfiction, and fiction are not collapsed; each has different risk profiles and different developmental dynamics
  • No effect-size-free conclusions, no vendor comparisons, no LLM-internals claims

Why this matters

If AI tools behave like cockpit automation in the regime we care about, and if the structural shift is replacing thinking rather than supporting it, the productivity story being told about AI is missing a major term. Improvements to model quality compound only to the extent that the coupled human-machine system actually capitalizes on them, and four decades of automation research suggest it often doesn't.

But the alternative is real and tractable: systems can be designed to make people more capable, not less. The point of this program is to find out — empirically, longitudinally, with measurement that can fail — whether and under what conditions that happens.