What is PeopleAnalyst?

PeopleAnalyst is the front door for people-analytics research: 205+ works indexed and profiled, 40+ citation-grade findings extracted, and peer-reviewed behavioral science translated from academic to actionable — the missing manual for the people analytics you always meant to do.

What is people analytics?

People analytics is not a dashboard. It is behavioral science and statistical inference applied to workforce decisions — a discipline with its own methodology, spanning measurement, organizational design, talent, leadership, and analytics craft.

Why does AI in HR need measurement science?

AI is being deployed in high-stakes people decisions — hiring, performance, attrition — without the measurement science to evaluate whether it works or whom it harms. Construct validity, effect sizes, and criterion validity are the vocabulary for asking an AI vendor the right questions.

How is the research made accessible?

The evidence is indexed and searchable: 205+ works, 40+ citation-grade insight cards, and 8 research arcs, so the right finding reaches the right decision at the right time.

What separates good people measurement from assertion?

Good measurement has a method: construct validity, reliability, and effect-size interpretation are not optional — they are what separates evidence from assertion.

research / namesake / bibliography

Literature map

Field positioning across the cultural-diffusion literature.

By Mike West

Namesake·Bibliography·source: people-analyst/baby-namer/docs/research/literature-map.md

Literature Map — Namesake Research Program (Cultural Diffusion)

Status: First merged draft Date: 2026-04-29 Scope: every cited or method-naming work that appears in docs/research/reports/ and the master spec docs/research/PHD_STUDY_SPEC.md. Each section names the construct or method, the Namesake measure, the canonical source, and Namesake's divergence or extension. Companion bibliography.bib carries BibTeX with verified DOIs. Method: Citations harvested by walking every report under docs/research/reports/ and the methods rows of PHD_STUDY_SPEC.md §1, then cross-referenced against shipped artifacts (null_model_summary.json, granger_results.parquet, event_panel.parquet, etc.). Citation style: APA 7 inline; BibTeX keys lastname_year_firstword in bibliography.bib. DOI policy: DOIs and URLs included only when verified to resolve. Public-dataset references appear in their own section.

Column legend:

Coordinate: the construct, method, or pipeline component being positioned.
Namesake measure: the concrete column, parquet artifact, or scoring step.
Primary source (APA 7): the canonical citation.
DOI / URL: verified publisher or stable identifier.
Relevance: why this source matters for the Namesake claim (≤40 words).
Namesake divergence or extension: what we do differently from the cited work.

A. The Lieberson neutral-fashion debate (`PHD_STUDY_SPEC` §1, Phase 5)

The whole research program is structured around a falsifiable contest: do cultural events drive naming, or does most of the observed turnover fall out of a neutral-copying ("internal fashion") process? Phase 5's null model is the referee.

Coordinate	Namesake measure	Primary source (APA)	DOI / URL	Relevance	Namesake divergence
Internal-fashion / "matter of taste" theory	Phase 5 null model; bucket field on `name_cultural_events` × `null_model_thresholds.parquet` (`drift-consistent`, `partially-cultural`, `culturally-influenced`, `strongly-cultural`)	Lieberson, S. (2000). A matter of taste: How names, fashions, and culture change. Yale University Press.	Yale UP	Origin of the claim that name churn arises predominantly from endogenous taste cycling rather than discrete cultural events. The ceiling our pipeline tries to break.	We attribute 843 specific cultural events with a confidence-thresholded LLM pipeline and benchmark each against the same neutral simulation, so the contrast becomes per-event rather than population-level.
Neutral drift / random copying applied to baby names	Innovation rate $\mu = 0.0505$ and effective population sizes ($N_e^F = 9{,}852$, $N_e^M = 22{,}320$) in `null_model_summary.json`	Hahn, M. W., & Bentley, R. A. (2003). Drift as a mechanism for cultural change: an example from baby names. Proceedings of the Royal Society B, 270(Suppl. 1), S120–S123.	10.1098/rsbl.2003.0045	Establishes the Wright–Fisher analogy for SSA-grade naming data and the magnitude of drift expected with realistic $N_e$.	We carry the framework forward to 2024, refit per-sex $N_e$ on the modern panel, and report the share of name-years that beat the null at p95 (4.57%) and p99 (2.83%).
Random-copying as a general cultural-change mechanism	Phase 5 phonetic-fashion null in `fit_phonetic_fashion.py`; `beats_phonetic_p99` flag (5.27% of name-years at p95)	Bentley, R. A., Lipo, C. P., Herzog, H. A., & Hahn, M. W. (2007). Regular rates of popular culture change reflect random copying. Evolution and Human Behavior, 28(3), 151–158.	10.1016/j.evolhumbehav.2006.10.002	Generalizes the drift result to dog breeds, pottery, and other cultural domains, predicting power-law turnover where random copying dominates.	The phonetic-fashion null incorporates Berger-style sound-neighborhood pull, so divergence between drift and phonetic nulls quantifies how much structure sits in sound communities versus generic copying.
Wright–Fisher population genetics	Posterior $N_e$ fits used to parameterize Phase 5 simulation	Wright, S. (1931). Evolution in Mendelian populations. Genetics, 16(2), 97–159.	10.1093/genetics/16.2.97	Foundational drift-only model the cultural application borrows from.	We use it as a null, not as a description; cultural innovation rate $\mu$ is not biological mutation.
Genetical theory of natural selection	Same	Fisher, R. A. (1930). The genetical theory of natural selection. Clarendon Press.	Internet Archive	Co-foundational source for the drift framework Phase 5 inherits via Hahn & Bentley.	We extend the analogy only as far as turnover statistics; do not import "fitness" as a per-name parameter.
Neutral theory of molecular evolution	Conceptual analogue for the "most change is drift" framing in `cultural-diffusion.md`	Kimura, M. (1968). Evolutionary rate at the molecular level. Nature, 217(5129), 624–626.	10.1038/217624a0	Provides the rhetorical template: most observed change is consistent with the null, with selection visible only in the tails.	Our findings reverse Kimura's emphasis: the modal name year is null-consistent, but the tail (4.57% at p95) is large enough that population-level claims of "drift dominates" understate culture's role.

B. Phonetic neighborhoods and sound-driven fashion (`PHD_STUDY_SPEC` §1, Phases 3 + 6)

The single sentence the cultural-diffusion essay keeps returning to — names do not travel alone; they travel in sound families — is operationalized as a phoneme-edit-distance graph.

Coordinate	Namesake measure	Primary source (APA)	DOI / URL	Relevance	Namesake divergence
Phonetic neighborhood as the unit of cultural diffusion in baby names	`phonetic_spillover_results.parquet` (166,046 phonetic pairs); `phonetic_clusters.parquet` (1,221 clusters); Phase 6 Welch t = 7.21 (p ≈ 1.1 × 10⁻¹²) for phonetic vs random pairs	Berger, J., Bradlow, E. T., Braunstein, A., & Zhang, Y. (2012). From Karen to Katie: Using baby names to understand cultural evolution. Psychological Science, 23(10), 1067–1073.	10.1177/0956797612443371	Establishes that phonetic neighbors co-vary in popularity at rates substantially above chance — the empirical foundation for Namesake's spillover analysis.	Berger inferred phonemes from spelling heuristics on a smaller window; we use the CMU Pronouncing Dictionary plus a g2p fallback across the full SSA universe (1880–2024) and persist the neighborhood graph as a queryable artifact.
ARPAbet phoneme strings for English given names	`research_name_phonemes` table; phoneme/sonority/onset-rhyme features in `data/research/derived/phonetic_features.parquet`	Carnegie Mellon University. (2014). The CMU Pronouncing Dictionary, version 0.7b [Software].	CMU	Provides ARPAbet for ~134K English words including a large fraction of attested first names; the substrate for the Phase 3 graph.	We backfill out-of-vocabulary names with a g2p model rather than dropping them, which matters because the most "culturally interesting" recent names (Khaleesi, Wrenley) are exactly the ones missing from CMU.

C. Cultural diffusion theory (`PHD_STUDY_SPEC` §1, Phase 7c)

The Bass model and the structural-virality framework are the two diffusion theories Namesake actually fits, not just cites.

Coordinate	Namesake measure	Primary source (APA)	DOI / URL	Relevance	Namesake divergence
Innovation/imitation decomposition of adoption curves	Phase 7c `bass_params.parquet` (60,470 fits, R² median 0.992): `p`, `q`, classification ∈ {broadcast 26.1%, peer 42.7%, mixed 26.8%, unfit 4.4%}	Bass, F. M. (1969). A new product growth for model consumer durables. Management Science, 15(5), 215–227.	10.1287/mnsc.15.5.215	Provides the (p, q) decomposition that lets us separate broadcast-driven from peer-driven names — the key behavioral prediction of the cultural-events theory.	Bass was about durable goods with a fixed market potential m; we fit on adoption-curve segments with rolling segmentation, accept that m is partly identified through the SSA registration ceiling, and report stratified medians by event_type rather than firm-level dynamics.
Diffusion of innovations as a general framework	Background motivation for Phase 7c segmentation choices and the "innovators / early adopters / majority" framing in the cultural-diffusion essay	Rogers, E. M. (2003). Diffusion of innovations (5th ed.). Free Press.	Free Press	Defines the conceptual vocabulary (innovator, imitator, S-curve) we operationalize via Bass and the spike-attribution panel.	Rogers is qualitative; we instantiate his typology as the Bass classification rule (`broadcast` vs `peer`) on 60K fitted curves.
Broadcast vs viral cascades	Conceptual frame for the side-quest panel on cascade structure (`docs/research/reports/side_quests.md`) and the Bass-classification stratification	Goel, S., Anderson, A., Hofman, J., & Watts, D. J. (2016). The structural virality of online diffusion. Management Science, 62(1), 180–196.	10.1287/mnsc.2015.2158	Formalizes the broadcast↔viral spectrum that Bass collapses into a single (p, q) ratio; constrains how to interpret peer-dominant names.	We apply the broadcast/viral framing to names rather than to URL cascades; the analogue of "tree depth" is the Phase 6 spillover ratio onto phonetic cousins.

D. Self-exciting processes and contagion dynamics (`PHD_STUDY_SPEC` §1, Phase 7b)

Hawkes processes give us a continuous-time complement to the annual-grain Granger work.

Coordinate	Namesake measure	Primary source (APA)	DOI / URL	Relevance	Namesake divergence
Self-exciting point processes; branching ratio as a contagion measure	Phase 7b `hawkes_params.parquet` (6,328 fits, 6,033 stable): branching ratio (median 0.241), half-life weeks (median 1.39), background rate $\mu$ (median 0.064), excitation $\alpha$, decay $\beta$	Hawkes, A. G. (1971). Spectra of some self-exciting and mutually exciting point processes. Biometrika, 58(1), 83–90.	10.1093/biomet/58.1.83	The canonical statement of the model; defines branching ratio < 1 as the stability boundary that 95.3% of fits satisfy.	We fit Hawkes on weekly Google Trends spike series rather than on births directly; the half-life captures the memory of cultural attention, not of the SSA file. The mismatch is intentional — short-horizon search excitation is exactly what the Granger annual aggregation cannot resolve (limitation §3 of the Phase 7a report).

E. Time-series causal inference — Granger causality and panel VAR (`PHD_STUDY_SPEC` §1, Phase 7a)

Coordinate	Namesake measure	Primary source (APA)	DOI / URL	Relevance	Namesake divergence
Granger causality between paired time series	Phase 7a `granger_results.parquet` (4,185 names, 21.5% significant at p<0.05 raw, 1.0% under BH at α=0.05; median optimal lag 1 year); validation canaries (Elsa, Arya, James, Khaleesi)	Granger, C. W. J. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37(3), 424–438.	10.2307/1912791	Defines the predictive-causality test that anchors the search→births claim.	The Khaleesi non-result identifies a class of within-year cultural shocks Granger cannot detect by construction — Phase 7b Hawkes is the explicit complement.
Panel vector autoregression and impulse-response analysis	Within-name demeaned panel VAR(p=3) on 4,206 names; pooled IRF +0.0063 at h=1 with 95% CI [+0.0045, +0.0081]; subsample IRFs reveal aspirational (+) vs cautionary (−) split	Sims, C. A. (1980). Macroeconomics and reality. Econometrica, 48(1), 1–48.	10.2307/1912017	Foundational specification for VAR/IRF analysis applied across the stacked panel; the orthogonalized IRF is what isolates the response of births to a one-SD search shock.	We apply Sims' macro tooling to a name-level cultural panel; the most consequential finding (negative IRF for `event_news_event` at h=2: −0.0226, 95% CI [−0.033, −0.009]) is a within-name fixed-effects result, not a cross-name comparison.

F. Synthetic-control causal inference (`PHD_STUDY_SPEC` §1, Phase 8)

Coordinate	Namesake measure	Primary source (APA)	DOI / URL	Relevance	Namesake divergence
Synthetic-control method (origination)	Donor-pool construction in `phase8_causal/synthetic_controls.py`; placebo distribution on 1,000 non-spiking names	Abadie, A., & Gardeazabal, J. (2003). The economic costs of conflict: A case study of the Basque Country. American Economic Review, 93(1), 113–132.	10.1257/000282803321455188	Introduces the convex-weights matching that lets us build a counterfactual for a single treated unit (here, a name with a cultural spike).	We treat 200 events instead of one and do per-event placebo inference rather than narrative comparison.
Synthetic-control method (formalization, placebo inference)	`research_event_ate` ATE_t1..ATE_t5 with placebo p-value; covariate set in `PHD_STUDY_SPEC` §4 (`is_lead_character`, `is_title_character`, `actor_popularity`, `budget`, `revenue`, `genres`)	Abadie, A., Diamond, A., & Hainmueller, J. (2010). Synthetic control methods for comparative case studies: Estimating the effect of California's tobacco control program. Journal of the American Statistical Association, 105(490), 493–505.	10.1198/jasa.2009.ap08746	Provides the MSPE-minimizing donor weights and the placebo-test framework for empirical p-values that we use directly.	The donor pool is matched on phonetic-neighborhood density (not just covariates), an extension that follows from Phase 6 spillover.

G. Spatial autocorrelation (`PHD_STUDY_SPEC` §1, Phase 10a)

Coordinate	Namesake measure	Primary source (APA)	DOI / URL	Relevance	Namesake divergence
Moran's I for spatial autocorrelation	Decade-by-decade Moran's I in `moran_report.md` (1960s: 0.515 → 2020s: 0.268); pre-streaming mean 0.429 vs post-streaming 0.291	Moran, P. A. P. (1950). Notes on continuous stochastic phenomena. Biometrika, 37(1/2), 17–23.	10.1093/biomet/37.1-2.17	Defines the I statistic that quantifies whether name popularity is geographically clustered — the empirical lever for the streaming-era homogenization claim.	We compute Moran's I on the SSA state file (`ssa_state_year.parquet`) rather than on contiguous neighborhoods; spatial weights are state-adjacency, and the trend (decreasing I) is read as evidence of streaming-era national-exposure flattening.

H. Predictability ceiling (`PHD_STUDY_SPEC` §1, Phase 10b)

Coordinate	Namesake measure	Primary source (APA)	DOI / URL	Relevance	Namesake divergence
Predictability of cultural-success outcomes; MusicLab paradigm	Phase 10b train-on-2004–2014, test-on-2015–2024 model panel (`predictability_ceiling.md`): full-feature logistic AUC 0.999, PR-AUC 0.881, P@100 = 0.98	Salganik, M. J., Dodds, P. S., & Watts, D. J. (2006). Experimental study of inequality and unpredictability in an artificial cultural market. Science, 311(5762), 854–856.	10.1126/science.1121066	Establishes the framing of "ceiling" prediction in cultural domains: even with full information about the past, future hits remain partially unpredictable.	Our high AUC is partly an artifact of structural features (prior popularity, phonetic density) constraining the possibility space — a divergence from Salganik's experimental setup, where path dependence on early plays is the main predictability bottleneck. We flag this in the report's interpretation rather than claiming we have broken Salganik's ceiling.

I. Saturation, exposure, and reactance — the Blockbuster Paradox (`PHD_STUDY_SPEC` §1, Phase 8b)

Coordinate	Namesake measure	Primary source (APA)	DOI / URL	Relevance	Namesake divergence
Hill equation as a saturation form for exposure → response	Phase 8b Hill fits in `blockbuster_paradox_report.md` (revenue: $E_{\max}$, EC50, h; reactance term γ)	Hill, A. V. (1910). The possible effects of the aggregation of the molecules of haemoglobin on its dissociation curves. Journal of Physiology, 40, iv–vii.	Wiley	Original Hill curve; the analytical scaffold for the saturation hypothesis in name adoption.	We add a reactance term to the standard Hill curve to test for non-monotonicity at high exposure — a structural deviation from Hill's one-way saturation.
Modern review and parameterization of the Hill equation	Same	Goutelle, S., Maurin, M., Rougier, F., Barbaut, X., Bourguignon, L., Ducher, M., & Maire, P. (2008). The Hill equation: a review of its capabilities in pharmacological modelling. Fundamental & Clinical Pharmacology, 22(6), 633–648.	10.1111/j.1472-8206.2008.00633.x	Practical guide we follow for parameter identification, especially the Hill coefficient h.	We accept that h < 1 is concavity, not paradox; the paradox claim is only triggered by a significant negative reactance γ — which the current sample does not support.
Adstock / advertising memory	Conceptual motivation for the Hill+exposure framing (PHD_STUDY_SPEC §1)	Broadbent, S. (1979). One way TV advertisements work. Journal of the Market Research Society, 21(3), 139–166.	—	Marketing-mix-modeling source for the decay-of-exposure assumption that exposure has a half-life and a saturation shoulder.	We use the adstock framing only metaphorically; the actual decay structure on names comes from Phase 7b Hawkes fits, not an adstock fit.
Psychological reactance as the mechanism behind the Blockbuster Paradox	Reactance term γ in the Hill+reactance specification (`blockbuster_paradox_report.md`)	Brehm, J. W. (1966). A theory of psychological reactance. Academic Press.	—	Provides the theoretical reason a parent might avoid a name precisely because it has become ambient — the missing piece in a pure-saturation account.	We test reactance empirically against null and find it not significant in the current sample (γ p > 0.05 across exposure measures); the Blockbuster Paradox remains a hypothesis with negative evidence rather than a confirmed finding.

J. Survival analysis (`PHD_STUDY_SPEC` §1, Phase 9 / future work)

Coordinate	Namesake measure	Primary source (APA)	DOI / URL	Relevance	Namesake divergence
Cox proportional-hazards model	Planned analysis of name "lifespan" after a cultural event (`PHD_STUDY_SPEC` §1, supporting role)	Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society. Series B (Methodological), 34(2), 187–220.	10.1111/j.2517-6161.1972.tb00899.x	Standard hazard-model machinery for time-to-decay analyses; the "half-life" intuition in the cultural-diffusion essay would be formalized here.	Currently scoped but not yet fit; flagged in the preregistration deviations log because the Phase 7b Hawkes half-life partially substitutes for the Cox lifespan estimand.

K. Multiple-testing control

Coordinate	Namesake measure	Primary source (APA)	DOI / URL	Relevance	Namesake divergence
Benjamini–Hochberg false-discovery-rate procedure	`granger_results.parquet` columns `pvalue_bh_adjusted`, `granger_significant_05_bh` (42 names / 1.0% survive at α=0.05 vs 901 / 21.5% raw)	Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1), 289–300.	10.1111/j.2517-6161.1995.tb02031.x	Provides the standard FDR control we apply across the ~5K per-name Granger family.	We persist both raw and BH-adjusted columns in the canonical artifact and route smart-list product features through the BH-adjusted gate, which is unusual; most published analyses report only one.

L. Public data sources referenced as research instruments

These are not theoretical works but they are the empirical substrate; the methodology section of every report eventually grounds out here.

Source	Namesake use	Citation
SSA national + state name files	`external/ssa_national_year.parquet` (2,149,477 rows), `external/ssa_state_year.parquet` (6,600,640 rows).	U.S. Social Security Administration. (2025). Beyond the top 1000 names [Data set]. ssa.gov
Google Trends	`external/google_trends_*` parquets; weekly fetcher in `cloud/modal_app.py::fetch_search_trends_weekly`.	Google. (2026). Google Trends. trends.google.com
GDELT 2.0	`external/gdelt_name_mentions.parquet`.	GDELT Project. (2024). The GDELT Project, version 2.0. gdeltproject.org
OMDb API	Spike-attribution evidence in `scripts/attribute_spikes.py`.	Fritz, B. (2024). OMDb API: The Open Movie Database. omdbapi.com
TMDb API v3	Cast / role metadata in `tmdb_titles.parquet`, `tmdb_cast.parquet`.	The Movie Database. (2024). TMDb API, version 3. themoviedb.org
Google Books Ngrams	`external/google_ngrams_names.parquet` (5.8M token rows) for pre-internet baseline.	Google Books Ngram Viewer. (2020). Google Books Ngram Corpus. ngrams datasets v3

M. What Namesake adds — gap analysis

Each row in this section is a place where the literature sets up a question the field has not answered with this depth of data, and where Namesake's pipeline produces the missing number.

Gap in the literature	What Namesake contributes	Where in the corpus
Lieberson 2000 vs Hahn–Bentley 2003 leave the question open: how much of observed turnover is drift, how much is cultural? Both sides argue from population-level statistics.	A per-name-year null-vs-cultural classification on 46,412 names with 4 buckets (drift-consistent → strongly-cultural). 4.57% of name-years beat neutral drift at p95; 5.27% beat the phonetic null.	`phase5_null_model_results.md`
Berger et al. 2012 establishes phonetic spillover at the population level but does not measure how much cultural mass leaks from the focal name to its acoustic cousins for specific events.	Welch t = 7.21 across 166K phonetic pairs; Phase 6 produces the focal share and spillover ratio per attributed event.	`phase6_phonetic_spillover.md`; planned per-event extension in Phase 8
Granger 1969 / Sims 1980: the literature has applied these to macroeconomic series for decades but rarely to attention → behavior conversion at scale.	4,185 per-name Granger tests with FDR correction; pooled IRF with subsample decomposition that uncovers an aspirational (+) vs cautionary (−) sign split by event_type.	`phase7a_granger_causality.md`
Salganik 2006 sets a predictability ceiling in an experimental cultural market but the natural-experiment version was not feasible without the Trends + SSA join.	Train-on-2004–2014 → test-on-2015–2024 model panel with full-feature logistic at AUC 0.999, deliberately presented as a structural-features upper bound rather than a Salganik-paradigm refutation.	`predictability_ceiling.md`
Bass 1969 / Goel 2016: broadcast-vs-peer typology is widely cited but rarely fit at corpus scale on a single domain.	60,470 Bass fits stratified by event_type; sports moments are the most peer-dominant (47.0% peer); animation is the most broadcast-dominant.	`phase7c_bass_diffusion.md`
Marketing-mix modeling treats exposure → outcome as a Hill curve but rarely tests a reactance term.	Hill+reactance specification across three exposure measures (revenue, spike magnitude, vote count); reactance non-significant in the current sample, which is itself a publishable null.	`blockbuster_paradox_report.md`
Most diffusion studies treat events as exogenous binary shocks.	Synthetic-controls design with per-event placebo p-values, planned for the 200 best-attributed events in Phase 8a, with covariate set documented in `PHD_STUDY_SPEC` §4.	Phase 8 (in flight at time of writing)

Companion files: bibliography.bib (BibTeX), PHD_STUDY_SPEC.md (master design), PIPELINE_STATUS.md (artifact freshness), and the per-phase reports under reports/.

Literature Map — Namesake Research Program (Cultural Diffusion)

A. The Lieberson neutral-fashion debate (PHD_STUDY_SPEC §1, Phase 5)

B. Phonetic neighborhoods and sound-driven fashion (PHD_STUDY_SPEC §1, Phases 3 + 6)

C. Cultural diffusion theory (PHD_STUDY_SPEC §1, Phase 7c)

D. Self-exciting processes and contagion dynamics (PHD_STUDY_SPEC §1, Phase 7b)

E. Time-series causal inference — Granger causality and panel VAR (PHD_STUDY_SPEC §1, Phase 7a)

F. Synthetic-control causal inference (PHD_STUDY_SPEC §1, Phase 8)

G. Spatial autocorrelation (PHD_STUDY_SPEC §1, Phase 10a)

H. Predictability ceiling (PHD_STUDY_SPEC §1, Phase 10b)

I. Saturation, exposure, and reactance — the Blockbuster Paradox (PHD_STUDY_SPEC §1, Phase 8b)

J. Survival analysis (PHD_STUDY_SPEC §1, Phase 9 / future work)