What is PeopleAnalyst?

PeopleAnalyst is the front door for people-analytics research: 205+ works indexed and profiled, 40+ citation-grade findings extracted, and peer-reviewed behavioral science translated from academic to actionable — the missing manual for the people analytics you always meant to do.

What is people analytics?

People analytics is not a dashboard. It is behavioral science and statistical inference applied to workforce decisions — a discipline with its own methodology, spanning measurement, organizational design, talent, leadership, and analytics craft.

Why does AI in HR need measurement science?

AI is being deployed in high-stakes people decisions — hiring, performance, attrition — without the measurement science to evaluate whether it works or whom it harms. Construct validity, effect sizes, and criterion validity are the vocabulary for asking an AI vendor the right questions.

How is the research made accessible?

The evidence is indexed and searchable: 205+ works, 40+ citation-grade insight cards, and 8 research arcs, so the right finding reaches the right decision at the right time.

What separates good people measurement from assertion?

Good measurement has a method: construct validity, reliability, and effect-size interpretation are not optional — they are what separates evidence from assertion.
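
As one illustration of effect-size interpretation, the sketch below computes Cohen's d, a standardized mean difference, for two groups using the pooled sample standard deviation. The function name and the scores are hypothetical and chosen only for illustration. The common 0.2 / 0.5 / 0.8 benchmarks for small, medium, and large effects are rules of thumb, not thresholds stated in this document.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical survey scores for two groups (illustrative data only).
trained = [4, 5, 6, 7, 8]
untrained = [2, 3, 4, 5, 6]
d = cohens_d(trained, untrained)
```

A raw difference of two points means little on its own. Dividing by the pooled standard deviation expresses it in units that can be compared across measures and studies.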

Research Pipeline — Status & Next Steps

Auto-generated: 2026-05-19 12:10 UTC by scripts/python/research/report_pipeline_status.py

This report is rewritten each time `npm run research:status` is invoked. Every number is sourced from actual parquet metadata or a Supabase `SELECT COUNT(*)` query. Do not edit this file by hand; edits will be overwritten.

Summary

Phase	Status	Detail
Phase 0: Scaffold	✅ Complete	Scripts scaffolded
Phase 1: Internal snapshot	⚪ Not Started	Missing required files: raw/name_enrichment.parquet, raw/name_stats.parquet, raw/name_rank_history.parquet, raw/name_search_trends.parquet
Phase 2: External data acquisition	⚪ Not Started	Missing required files: external/cmu_pronouncing_dict.parquet, external/ssa_national_year.parquet, external/google_ngrams_names.parquet
Phase 3a: Phoneme decomposition	⚪ Not Started	Missing required files: derived/name_phonemes.parquet
Phase 3b: Phonetic neighborhood graph	⚪ Not Started	Missing required files: derived/phonetic_neighbors.parquet
Phase 4: Panel construction	⚪ Not Started	Missing required files: derived/annual_panel.parquet, derived/weekly_panel.parquet
Phase 5: Null model (neutral drift)	⚪ Not Started	Missing required files: processed/null_model_thresholds.parquet
Phase 6: Phonetic spillover	⚪ Not Started	Missing required files: processed/phonetic_spillover_results.parquet
Phase 7: Timeseries (Hawkes/Bass/Granger)	⚪ Not Started	No outputs yet
Phase 8: Causal analysis	⚪ Not Started	No outputs yet
Phase 9: Heterogeneity decomposition	⚪ Not Started	No outputs yet
Phase 10: Geographic + predictability	⚪ Not Started	No outputs yet
Phase 11: Final report	⚪ Not Started	No outputs yet
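
The Detail column suggests that a phase is reported as not started while any of its required files is missing. A minimal sketch of that rule follows. It assumes a simplified two-state model; the function name `phase_status` and the "All required files present" wording are hypothetical, and the actual logic in report_pipeline_status.py is not shown here and may differ.

```python
from pathlib import Path
import tempfile


def phase_status(data_dir, required_files):
    """Return (status, detail) for one phase based on which required files exist."""
    root = Path(data_dir)
    missing = [name for name in required_files if not (root / name).is_file()]
    if missing:
        return "Not Started", "Missing required files: " + ", ".join(missing)
    return "Complete", "All required files present"


# Demonstrate against a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    required = ["derived/annual_panel.parquet", "derived/weekly_panel.parquet"]
    before = phase_status(tmp, required)
    for name in required:
        path = Path(tmp) / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(b"")  # empty placeholder, not a real parquet file
    after = phase_status(tmp, required)
```

Because the status is derived from the filesystem rather than stored, rerunning the check after a phase writes its outputs updates the report without manual edits.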

Data Files

File	Phase	Status	Rows	Size	Updated
`raw/name_enrichment.parquet`	Phase 1	❌	—	—	—
`raw/name_stats.parquet`	Phase 1	❌	—	—	—
`raw/name_rank_history.parquet`	Phase 1	❌	—	—	—
`raw/name_search_trends.parquet`	Phase 1	❌	—	—	—
`raw/name_spike_events.parquet`	Phase 1	❌	—	—	—
`raw/name_cultural_events.parquet`	Phase 1	❌	—	—	—
`external/ssa_national_year.parquet`	Phase 2	❌	—	—	—
`external/ssa_state_year.parquet`	Phase 2	❌	—	—	—
`external/cmu_pronouncing_dict.parquet`	Phase 2	❌	—	—	—
`external/google_ngrams_names.parquet`	Phase 2	❌	—	—	—
`external/gdelt_name_mentions.parquet`	Phase 2	❌	—	—	—
`external/cdc_natality_monthly.parquet`	Phase 2	❌	—	—	—
`external/place_names.parquet`	Phase 2	❌	—	—	—
`external/babynames_tidy.parquet`	Phase 2	❌	—	—	—
`external/omdb_titles.parquet`	Phase 2	⚪	—	—	—
`derived/name_phonemes.parquet`	Phase 3a	❌	—	—	—
`derived/phonetic_neighbors.parquet`	Phase 3b	❌	—	—	—
`derived/annual_panel.parquet`	Phase 4a	❌	—	—	—
`derived/weekly_panel.parquet`	Phase 4b	❌	—	—	—
`derived/event_panel.parquet`	Phase 4c	❌	—	—	—
`processed/null_model_thresholds.parquet`	Phase 5	❌	—	—	—
`processed/phonetic_spillover_results.parquet`	Phase 6	⚪	—	—	—
`processed/phonetic_clusters.parquet`	Phase 6	⚪	—	—	—
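
The Size and Updated columns can be filled from filesystem metadata alone. The sketch below shows one way to do that using only the standard library. The function name `describe_file` and the returned keys are hypothetical. The Rows column would additionally need a Parquet reader (for example, `pyarrow.parquet.read_metadata(path).num_rows`); which reader the real script uses is not stated here.

```python
from datetime import datetime, timezone
from pathlib import Path
import tempfile


def describe_file(path):
    """Return existence, size in bytes, and last-modified time (UTC) for a file."""
    p = Path(path)
    if not p.is_file():
        return {"exists": False, "size_bytes": None, "updated": None}
    info = p.stat()
    updated = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "exists": True,
        "size_bytes": info.st_size,
        "updated": updated.strftime("%Y-%m-%d %H:%M UTC"),
    }


# Self-contained demonstration using a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    absent = describe_file(Path(tmp) / "raw" / "name_stats.parquet")
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"\x00" * 2048)  # stand-in file, not real parquet data
    present = describe_file(sample)
```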

Database Spot Check

Table	DB Rows	Parquet Rows	Divergence	Status
`name_enrichment`	—	—	—	⚠️ NEXT_PUBLIC_SUPABASE_URL and SUPABASE_SE
`name_stats`	—	—	—	⚠️ NEXT_PUBLIC_SUPABASE_URL and SUPABASE_SE
`name_search_trends`	—	—	—	⚠️ NEXT_PUBLIC_SUPABASE_URL and SUPABASE_SE
`name_spike_events`	—	—	—	⚠️ NEXT_PUBLIC_SUPABASE_URL and SUPABASE_SE
`name_cultural_events`	—	—	—	⚠️ NEXT_PUBLIC_SUPABASE_URL and SUPABASE_SE
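
The report does not define how the Divergence column is computed. The sketch below assumes it is the absolute difference between the two counts, expressed as a percentage of the database count. The function names and the 1% tolerance are hypothetical. Missing counts, as in the rows above where no database connection was available, are treated as not comparable rather than as zero.

```python
def divergence_pct(db_rows, parquet_rows):
    """Percent difference relative to the DB count; None if not comparable."""
    if db_rows is None or parquet_rows is None:
        return None
    if db_rows == 0:
        # Avoid division by zero: equal empty counts match, otherwise undefined.
        return 0.0 if parquet_rows == 0 else None
    return abs(db_rows - parquet_rows) / db_rows * 100.0


def spot_check(db_rows, parquet_rows, tolerance_pct=1.0):
    """Classify one table as 'ok', 'diverged', or 'unknown'."""
    pct = divergence_pct(db_rows, parquet_rows)
    if pct is None:
        return "unknown"
    return "ok" if pct <= tolerance_pct else "diverged"
```

For example, `spot_check(1000, 900)` reports a divergence because the counts differ by 10%, while `spot_check(None, 900)` reports an unknown state instead of a false match.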

Blockers

Next Steps

Complete Phase 1 (Internal snapshot). Missing required files: raw/name_enrichment.parquet, raw/name_stats.parquet, raw/name_rank_history.parquet, raw/name_search_trends.parquet
Run scripts/attribute_spikes.py to populate cultural events
Resume Google Trends fetch (A-025)

Master spec: PHD_STUDY_SPEC.md
Operator manual: ../../scripts/python/research/README.md
Auto-refreshed nightly by .github/workflows/research-status-nightly.yml