The PeopleAnalyst Guide to Work Rules · Ch 09

Building a Learning Institution

What Bock argues

Corporations spend enormous sums on training and get startlingly little back — and Bock's response is to rebuild learning around two ideas. First, teach internally: your best practitioners are your best teachers (Google's "Googler-to-Googler" model), and peer-taught, job-embedded learning beats the outsourced workshop. Second, judge learning by behavior change, not satisfaction — the workshop everyone "loved" that changed nothing is the default failure mode, and a learning institution measures whether people actually do something different afterward. The frame is deliberate practice over passive consumption, and evidence over the feedback form.

The instinct — that most training is theater measured by the wrong instrument — is exactly what the training-evaluation literature has been saying for decades, mostly to a room that wasn't listening.

What the research actually says (and where 2015 needs an update)

The measurement problem is the heart of it, and it has a canonical map in Kirkpatrick's four levels: Level 1 reaction (did they like it), Level 2 learning (did they know more afterward), Level 3 behavior (did they do it differently on the job), and Level 4 results (did the organizational outcome move). The decades-old, widely ignored finding is that almost everyone measures Level 1, the "smile sheet," and stops, even though Level 1 is nearly uncorrelated with the levels that matter. People can love a course and change nothing; they can resent one and transform. Thalheimer and others have hammered the smile sheet's emptiness; the transfer problem (learning in the room that never reaches the job) is where training value actually leaks. A learning institution lives or dies at Level 3.
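A quick way to test the smile-sheet claim on your own program is to correlate Level 1 scores with a Level 3 measure for the same people. The sketch below is illustrative only: the data are made up, and the names `pearson_r`, `satisfaction`, and `behavior_change` are hypothetical rather than anything the Guide defines.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for eight participants in one course.
satisfaction = [5, 4, 5, 3, 4, 2, 5, 4]     # Level 1: smile-sheet rating, 1 to 5
behavior_change = [1, 0, 1, 0, 2, 1, 1, 2]  # Level 3: change in weekly count of the target behavior

r = pearson_r(satisfaction, behavior_change)
print(f"Level 1 vs Level 3 correlation: r = {r:.2f} (variance shared: {r * r:.0%})")
```

With real data the point is the same: if the correlation is near zero, the satisfaction score carries almost no information about whether behavior changed, so it cannot stand in for a Level 3 measure.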

On the practice side, Bock leans on deliberate practice (Ericsson) — expertise from effortful, feedback-rich, repeated practice at the edge of ability, not from hours merely logged. The concept is real and useful as a design principle, but the Guide must carry the finding that should make any corporate L&D function cautious: Macnamara and colleagues' meta-analysis found deliberate practice explains far less of performance than Ericsson claimed, and — critically for a workplace book — its explanatory power is lowest exactly where we work. Deliberate practice accounted for roughly 26% of performance variance in games, 21% in music, 18% in sports, 4% in education, and under 1% in the professions.¹ The professions are the relevant column, and there it explains almost nothing — because professional performance is messy, multiply-determined, and far from the clean feedback loops of chess or scales. So use deliberate practice to design learning (effortful, specific, fed back), but do not promise it will manufacture experts at work; in the domain this book is about, most of the variance lives elsewhere.
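To read those percentages as effect sizes, it can help to convert variance explained into the implied correlation, |r| = sqrt(R²). A small sketch using only the domain figures quoted above; the professions value is entered as 0.01 because the text gives it only as "under 1%", so that row is an upper bound rather than a point estimate. The function name `implied_correlation` is hypothetical.

```python
import math

# Share of performance variance explained by deliberate practice, by domain,
# as quoted in the text (Macnamara et al., 2014).
variance_explained = {
    "games": 0.26,
    "music": 0.21,
    "sports": 0.18,
    "education": 0.04,
    "professions": 0.01,  # text says "under 1%": treat as an upper bound
}

def implied_correlation(r_squared: float) -> float:
    """Return the absolute correlation implied by a proportion of variance explained."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("variance explained must be between 0 and 1")
    return math.sqrt(r_squared)

for domain, r2 in variance_explained.items():
    r = implied_correlation(r2)
    print(f"{domain:12s} explained {r2:>4.0%}  implied |r| = {r:.2f}  unexplained {1 - r2:.0%}")
```

Read this way, even the strongest domain leaves roughly three quarters of the variance to other factors, and in the professions at least 99% lies elsewhere.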

Where 2015 needs the update: AI tutoring and on-the-job skill measurement are the first technologies that make Level-3 measurement and feedback-rich practice cheap at scale — a personalized practice partner that gives immediate, specific feedback is deliberate practice's enabling tool. The discipline is the program's own longitudinal test, stated honestly: are people measurably better at the real work with the tool than without it, in six months? Not "did they like the AI tutor" (the smile sheet reborn) — did their behavior change and hold. The Level-1 trap is exactly the trap AI-learning hype is walking into now.

How you run it

Learning-transfer measurement at the behavior level. Define the on-the-job behavior the learning should change, and measure that — Level 3 — not satisfaction, not a quiz.
Pre/post with a control. A trained group vs a comparable untrained one, before and after, so the change is attributable to the learning and not to everything else (ties to Chapter 12's experiment discipline).
Feedback-richness audit. Is the "learning" actually deliberate practice (effortful, specific, fed back) or passive consumption with a certificate?
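The pre/post-with-control design above reduces to a difference-in-differences estimate: the trained group's change minus the comparison group's change. A minimal sketch, assuming a single numeric behavior measure per person (here a hypothetical weekly count of the target behavior) and made-up data; the function name `diff_in_diff` is illustrative.

```python
from statistics import mean

def diff_in_diff(trained_pre, trained_post, control_pre, control_post):
    """Change in the trained group minus change in the control group."""
    trained_change = mean(trained_post) - mean(trained_pre)
    control_change = mean(control_post) - mean(control_pre)
    return trained_change - control_change

# Hypothetical data: weekly count of the target behavior, one value per person.
trained_pre = [2, 3, 1, 2, 4, 3]
trained_post = [4, 5, 3, 4, 6, 5]
control_pre = [2, 2, 3, 1, 3, 4]
control_post = [3, 2, 3, 2, 3, 5]

effect = diff_in_diff(trained_pre, trained_post, control_pre, control_post)
print(f"Trained change minus control change: {effect:+.2f} behaviors per week")
```

The subtraction only isolates the training's effect if the two groups would have moved in parallel without it, which is why the comparison group has to be genuinely comparable.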

The analysis you can execute

A learning-impact / transfer analysis, one of the genuinely net-new builds the chapter map flags: behavior-level outcome definition and pre/post-with-control estimation (reuse calculus for the effect and honest CIs, and forecasting to decide whether the evaluation is even worth running, per Chapter 12). The deliverable is the number almost no L&D function produces: did behavior change, attributably, and did it hold.
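One way to attach an honest interval to a pre/post-with-control estimate is a percentile bootstrap over per-person change scores. The sketch below is an illustration under stated assumptions, not the Guide's own tooling: the function name `bootstrap_did_ci`, the six-month follow-up data, and the 95% level are all hypothetical, and each change score is a person's follow-up measure minus their baseline.

```python
import random
from statistics import mean

def bootstrap_did_ci(trained_change, control_change, n_boot=5000, alpha=0.05, seed=0):
    """Point estimate and percentile-bootstrap interval for
    mean(trained change) minus mean(control change)."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    estimates = []
    for _ in range(n_boot):
        # Resample people with replacement, separately within each group.
        t = rng.choices(trained_change, k=len(trained_change))
        c = rng.choices(control_change, k=len(control_change))
        estimates.append(mean(t) - mean(c))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    point = mean(trained_change) - mean(control_change)
    return point, lower, upper

# Hypothetical change scores at the six-month follow-up (follow-up minus baseline).
trained_change_6mo = [2, 1, 2, 3, 1, 2, 2, 1, 3, 2]
control_change_6mo = [0, 1, 0, 1, 0, 1, 1, 0, 0, 1]

point, lower, upper = bootstrap_did_ci(trained_change_6mo, control_change_6mo)
print(f"Six-month effect: {point:.2f} (95% interval {lower:.2f} to {upper:.2f})")
```

If the six-month interval sits above zero, the behavior changed and held; if it straddles zero, the honest report is that the evaluation could not tell. With groups this small the interval will be wide, which is itself useful information about whether a larger evaluation is worth running.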

The AI-era turn

Use AI for the two things it makes cheap: feedback-rich deliberate practice (a tireless practice partner) and Level-3 measurement (observing whether the behavior actually changed on the job). Judge both by the six-month behavior test, not by satisfaction. The failure mode is importing the smile-sheet into the AI era — "engagement with the learning app" as the metric. Engagement is Level 1 wearing a dashboard.

What to do Monday

Pick one training investment and ask the Level-3 question: what behavior was it supposed to change, and did it? If you can't answer, you've been buying smile sheets.
Run one pre/post-with-control evaluation — even crude — so you have one attributable learning result.
Redesign one course as deliberate practice: effortful, specific, immediate feedback — and drop a passive module that only earns a certificate.
For any AI learning tool, set the metric to the six-month behavior test, not app engagement.

Cross-refs: Ch 12 (experiment discipline: pre/post-with-control, value of information (VoI) on the evaluation); Ch 13 (report the learning programs that didn't work); the Penwright longitudinal test (better with it than without it, in six months).

¹ Macnamara, B. N., Hambrick, D. Z., & Oswald, F. L. (2014). Deliberate practice and performance in music, games, sports, education, and professions: A meta-analysis. Psychological Science, 25(8), 1608–1618. Domain estimates as reported; the Ericsson (2016) reply contests the definition and measurement of deliberate practice. The live debate is itself the reason to treat deliberate practice as a design principle, not a law.