What is PeopleAnalyst?

PeopleAnalyst is the front door for people-analytics research: 205+ works indexed and profiled, 40+ citation-grade findings extracted, and peer-reviewed behavioral science translated from academic to actionable — the missing manual for the people analytics you always meant to do.

What is people analytics?

People analytics is not a dashboard. It is behavioral science and statistical inference applied to workforce decisions — a discipline with its own methodology, spanning measurement, organizational design, talent, leadership, and analytics craft.

Why does AI in HR need measurement science?

AI is being deployed in high-stakes people decisions — hiring, performance, attrition — without the measurement science to evaluate whether it works or whom it harms. Construct validity, effect sizes, and criterion validity are the vocabulary for asking an AI vendor the right questions.

How is the research made accessible?

The evidence is indexed and searchable: 205+ works, 40+ citation-grade insight cards, and 8 research arcs, so the right finding reaches the right decision at the right time.

What separates good people measurement from assertion?

Good measurement has a method: construct validity, reliability, and effect-size interpretation are not optional — they are what separates evidence from assertion.

The principal-issues thesis

The headline thread — load-bearing analytics, the structure-first pipeline, and the value stack (Employee Lifetime Value → activation → Net Activated Value → opportunity), demonstrated live on a compensation command center running structured synthetic data.

By Mike West

People Analytics Toolbox · Reports

Every domain has a load-bearing set of questions it has to answer well to function. Most domains are stuck because they have never named theirs. This is what naming it looks like for people analytics — and the workflow that answers it, running end-to-end, demonstrated live.

There is a version of people analytics that is mostly dashboards. A number goes up, a number goes down, somebody makes a slide. The work is real and the charts are honest, but the room is never quite sure what to do with any of it, because the dashboard answers a question that no load-bearing decision was waiting on. It tells you the headcount. It does not tell you where the next dollar of pay buys the most retained value, or which of your competitors is quietly draining your best people, or whether the team that looks fine on the survey is one departure away from a problem.

The thesis underneath this platform is that every applied domain has a small set of principal issues — the load-bearing questions it has to answer well to function — and that most domains stall out because they have never written that set down. Once you name the set, two things follow. You can build the workflow that answers it end-to-end, instead of assembling charts and hoping the meaning falls out. And you can be honest about the gap between the questions that matter and the questions your current data can actually support.

This piece names the set for people analytics, walks the workflow that answers it, and shows the value stack it produces. The workflow is live: it runs end-to-end today, demonstrated on a compensation command center. The numbers in the demonstration come from structured synthetic data, generated to be realistic and carrying no confidential information from any real organization. We will be explicit about that throughout, because the discipline of saying where every number came from is half of what separates analytics from decoration.

The pipeline: structure first, then calculate

The mistake most analytics stacks make is to calculate first and segment later — to compute an average and then go looking for slices of the population that explain it. That order produces noise. It lets a calculation reach back into the data and redraw the groups until the answer looks clean, which is a polite description of fooling yourself.
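The failure mode described above can be made concrete with a small sketch. It assumes pure-noise scores with no real group structure, then compares the gap found by one grouping fixed in advance against the largest gap found by searching over many groupings after the fact. The scores, the group sizes, and the number of candidate groupings are all hypothetical choices made for illustration.

```python
import random
from statistics import mean

rng = random.Random(7)

# Pure noise: 200 hypothetical scores with no real group structure at all.
scores = [rng.gauss(50.0, 10.0) for _ in range(200)]


def gap(values, in_group_a):
    """Absolute difference in mean between group A and everyone else."""
    a = [v for v, flag in zip(values, in_group_a) if flag]
    b = [v for v, flag in zip(values, in_group_a) if not flag]
    return abs(mean(a) - mean(b))


# Structure first: one grouping, fixed before looking at any score.
fixed_grouping = [i < 100 for i in range(len(scores))]
fixed_gap = gap(scores, fixed_grouping)

# Calculate first, segment later: try many groupings and keep whichever
# one makes the difference look most dramatic.
candidates = [fixed_grouping]
for _ in range(500):
    labels = [True] * 100 + [False] * 100
    rng.shuffle(labels)
    candidates.append(labels)

searched_gap = max(gap(scores, g) for g in candidates)

print(f"gap with grouping fixed up front: {fixed_gap:.2f}")
print(f"largest gap after searching:      {searched_gap:.2f}")
```

Because the search includes the fixed grouping among its candidates, the searched gap can never be smaller than the fixed one. Any extra "effect" it reports is a product of the search, since the data contains no group effect to find.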

The workflow here runs in the other order. It structures the data first, and only then calculates downstream of that structure.

The stages, in sequence:

1. Raw data enters: the kind of records any HR information system holds. Who works here, in what job, at what level, in what location, paid how much, for how long, performing at what level.
2. Augmentation for segmentation. Before anything is averaged, the raw records are enriched into the dimensions a person would actually want to cut the population by: job family and job level (what work someone does), organizational unit (where they sit), geography, tenure band, performance band. These are deliberately kept distinct. What work someone does and where they sit are correlated but not interchangeable, and conflating them produces segmentation that quietly misleads. The augmentation step is where the data gets the structure that makes every later cut trustworthy.
3. Calculation runs downstream. Statistics, trends, and models run against the structured population, never the other way around. The calculations consume the segmentation; they never push back into it. This one-directional rule is the whole point. Segmentation defines the groups; calculation measures them. Because the groups are fixed before the measuring starts, a result that holds up in one cut is a real result, not an artifact of where the lines happened to land.
4. Segmented outputs come out the far end: measured results, cut by the structure that was set up front.
5. Those outputs are consumed into the value measures below.

The reason this order is load-bearing rather than stylistic: it is the difference between an analysis that can be cross-examined and one that cannot. When the structure is fixed before the math, anyone in the room can ask "show me that same number for engineers, for the West region, for people under two years tenure" — and the answer is already coherent, because those cuts were defined before any number was computed.
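The one-directional rule can be sketched as follows. This is a minimal illustration, not the platform's implementation: the record fields, the lookup tables, the tenure bands, and the sample employees are all hypothetical. The point it shows is that augmentation produces an immutable structured population, and the calculation step only reads from it.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class RawRecord:
    employee_id: str
    job_title: str
    org_unit: str
    location: str
    tenure_years: float
    base_pay: float


@dataclass(frozen=True)  # frozen: calculations cannot push back into the structure
class StructuredRecord:
    employee_id: str
    job_family: str   # what work someone does
    org_unit: str     # where they sit (kept distinct from job family)
    region: str
    tenure_band: str
    base_pay: float


# Hypothetical lookup tables; a real system would maintain these elsewhere.
JOB_FAMILY = {"Software Engineer": "Engineering", "Account Executive": "Sales"}
REGION = {"Seattle": "West", "Boston": "East"}


def tenure_band(years: float) -> str:
    if years < 2:
        return "<2y"
    if years < 5:
        return "2-5y"
    return "5y+"


def augment(raw: RawRecord) -> StructuredRecord:
    """Stage 2: enrich a raw record into segmentation dimensions."""
    return StructuredRecord(
        employee_id=raw.employee_id,
        job_family=JOB_FAMILY.get(raw.job_title, "Other"),
        org_unit=raw.org_unit,
        region=REGION.get(raw.location, "Other"),
        tenure_band=tenure_band(raw.tenure_years),
        base_pay=raw.base_pay,
    )


def mean_pay_by(population, dimension: str) -> dict:
    """Stage 3: measure groups that were defined before any math ran."""
    groups = defaultdict(list)
    for rec in population:
        groups[getattr(rec, dimension)].append(rec.base_pay)
    return {key: mean(values) for key, values in groups.items()}


raw = [
    RawRecord("e1", "Software Engineer", "Platform", "Seattle", 1.5, 150_000.0),
    RawRecord("e2", "Software Engineer", "Platform", "Boston", 6.0, 170_000.0),
    RawRecord("e3", "Account Executive", "Revenue", "Seattle", 3.0, 110_000.0),
]

# Structure is fixed here, once, before any statistic is computed.
population = tuple(augment(r) for r in raw)

pay_by_family = mean_pay_by(population, "job_family")
pay_by_region = mean_pay_by(population, "region")
```

Every later cut ("by job family", "by region", "by tenure band") reuses the same frozen population, which is why the answers stay coherent with each other under cross-examination.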

The value stack: from a headcount to where the next dollar goes

Naming the principal issues for people analytics means naming the small set of measures that actually decide things. The platform organizes them into a stack — each measure built on the one before it, each one a step closer to a decision.

Employee Lifetime Value (ELV). The total value a role is positioned to create over the time a person holds it. It is the denominator everything else is measured against — the size of the prize at full strength.

Activation (NA%). The share of that potential value actually being realized today. A role can be worth a great deal and be running at a fraction of it — under-supported, misaligned, or held by someone halfway out the door. Activation is the honest read on how much of the prize you are actually collecting. (The team-level diagnostic underneath activation — which of capability, alignment, motivation, and support is the binding constraint on a given team right now — is the same protected-feedback machinery the Performix product is built on, consumed here as one input rather than rebuilt.)

Net Activated Value (NAV). Activation times lifetime value: NAV = NA% × ELV. The value you are actually capturing, as opposed to the value the role could theoretically produce. This is the number that should be on the wall, and almost never is.

Opportunity. Lifetime value minus the value you are capturing: Opportunity = ELV − NAV. This is the headline. Opportunity is the value present in the workforce that is not yet being collected — and crucially, it is segmented, so it tells you not just how much is on the table but where. The places with the largest opportunity are the places where activating value — through pay, through support, through unblocking the binding constraint — pays back the most. It turns a vague instinct ("we should invest in our people") into a ranked map of where investment actually returns.
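The arithmetic of the stack fits in a few lines. The sketch below assumes ELV and activation have already been estimated per segment; the Segment class, the segment names, and every figure are hypothetical illustrations, not results from the platform or from any organization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    elv: float         # Employee Lifetime Value, in currency units
    activation: float  # NA%, expressed as a fraction between 0 and 1


def nav(seg: Segment) -> float:
    """Net Activated Value: NAV = NA% x ELV."""
    return seg.activation * seg.elv


def opportunity(seg: Segment) -> float:
    """Opportunity = ELV - NAV, the value present but not yet collected."""
    return seg.elv - nav(seg)


# Illustrative figures only.
segments = [
    Segment("Engineering / West", 1_000_000.0, 0.50),
    Segment("Sales / East", 800_000.0, 0.75),
    Segment("Support / Central", 400_000.0, 0.25),
]

# The ranked map: largest uncollected value first.
ranked = sorted(segments, key=opportunity, reverse=True)

for seg in ranked:
    print(f"{seg.name:20s} NAV={nav(seg):>10,.0f}  opportunity={opportunity(seg):>10,.0f}")
```

Note that the ranking is by opportunity, not by activation. In this example the segment with the lowest activation (Support / Central at 25%) ranks second, because its ELV is smaller than that of the half-activated engineering segment.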

Two analyses sit alongside the stack and feed it:

Leadership Quality. A composite score on a 0–100 scale, built from three honest inputs: the performance-management program's own signal, the activation/binding-constraint read on the leader's teams, and compensation stewardship — whether the leader is allocating pay in a way that is consistent and defensible. It is a composite, so no single input can carry it, and each input is itself traceable back to the structured data. The point is not to rank managers for sport; it is to locate where leadership quality is itself one of the constraints on activated value.

Attrition, from-to. Not just how many people left, but who you win talent from and who you lose it to (the from-to flows between your organization and the others in your market), together with the regretted-attrition rate: the share of departures you would have paid to keep. A raw turnover number tells you almost nothing. Knowing that your strongest engineers leave for two specific competitors, and that those departures were the ones you most wanted to prevent, tells you exactly where the pay-fairness and retention conversation needs to happen.
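A from-to analysis reduces to counting moves by origin and destination. The sketch below is a minimal illustration under stated assumptions: the Move record, the organization names ("OurCo", "RivalA", "RivalB"), and the regretted flag are hypothetical, and how a departure gets labeled as regretted is left outside the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    employee_id: str
    origin: str              # employer the person left
    destination: str         # employer the person joined
    regretted: bool = False  # only meaningful for departures from US


US = "OurCo"  # hypothetical name for the home organization

moves = [
    Move("e1", US, "RivalA", regretted=True),
    Move("e2", US, "RivalA", regretted=True),
    Move("e3", US, "RivalB", regretted=False),
    Move("e4", "RivalB", US),
    Move("e5", US, "RivalB", regretted=True),
    Move("e6", "RivalA", US),
]

# Who we lose talent to, and who we win it from.
lost_to = Counter(m.destination for m in moves if m.origin == US)
won_from = Counter(m.origin for m in moves if m.destination == US)

# Net flow per organization: positive means we gain more than we lose.
net_flow = {
    org: won_from[org] - lost_to[org]
    for org in set(lost_to) | set(won_from)
}

# Regretted-attrition rate: share of departures we would have paid to keep.
departures = [m for m in moves if m.origin == US]
regretted_rate = sum(m.regretted for m in departures) / len(departures)
```

In practice the same counts would be cut by the segmentation dimensions fixed upstream (job family, level, region), so that the flows can be read per segment rather than only in total.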

Read top to bottom, the stack converts a headcount into a decision: here is the value present, here is the share you are collecting, here is what is left on the table, and here — by segment — is where collecting more of it pays back most.

The demonstration data is structured, not random — and it is synthetic

A workflow like this is only as convincing as the data you can show it running on. Real client data cannot go on a public page; random data produces random noise that demonstrates nothing. So the live demonstration runs on data that is deliberately a third thing: structured synthetic data — synthesized from the patterns real HR information systems exhibit, anonymized, carrying zero confidential information, with the relationships that make analytics meaningful baked in on purpose.

What "baked in" means, concretely:

Pay varies the way pay actually varies. Level differences dominate; function matters next; geography after that; tenure least — the structure compensation actually exhibits when you look at clean data. So a fairness or consistency check run against this population finds real structure, not white noise.
Exit hazard is driven, not assigned. The probability that a synthetic employee leaves is a function of their activation, their pay gap to market, their tenure, and their performance — the same forces that drive departures in the real world. So the attrition and opportunity analyses surface patterns that mean something, because the patterns were authored into the generating process rather than sprinkled on top.
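One way such a generating process could look is sketched below. It is not the platform's generator. Every multiplier, coefficient, and range is an assumption chosen so that the pay spreads are ordered level > function > geography > tenure, and so that exit probability responds to activation, pay gap to market, tenure, and performance. The signs of those effects are also assumptions, since the text names the drivers but not their direction.

```python
import math
import random

# All multipliers and coefficients are illustrative assumptions.
BASE_PAY = 60_000.0
LEVEL_MULT = {1: 1.00, 2: 1.35, 3: 1.80, 4: 2.40}             # largest spread
FUNCTION_MULT = {"Engineering": 1.30, "Sales": 1.15, "Support": 1.00}
GEO_MULT = {"West": 1.12, "East": 1.06, "Central": 1.00}
TENURE_RATE = 0.005  # +0.5% per year, capped at 10 years: smallest spread


def synthetic_pay(level, function, region, tenure_years, rng):
    """Pay built from structure plus a small amount of noise."""
    tenure_mult = 1.0 + TENURE_RATE * min(tenure_years, 10.0)
    noise = rng.gauss(1.0, 0.03)
    return (BASE_PAY * LEVEL_MULT[level] * FUNCTION_MULT[function]
            * GEO_MULT[region] * tenure_mult * noise)


def exit_probability(activation, pay_gap, tenure_years, performance):
    """Logistic exit model. pay_gap is pay relative to market (-0.2 = 20% below)."""
    z = (-1.0
         - 2.5 * activation     # assumed: higher activation lowers exit risk
         - 5.0 * pay_gap        # assumed: being paid below market raises it
         - 0.08 * tenure_years  # assumed: longer tenure lowers it
         - 0.3 * performance)   # assumed: higher performance lowers it
    return 1.0 / (1.0 + math.exp(-z))


def generate(n, seed=0):
    """Generate n synthetic employees; the same seed gives the same population."""
    rng = random.Random(seed)
    people = []
    for i in range(n):
        level = rng.choice(list(LEVEL_MULT))
        function = rng.choice(list(FUNCTION_MULT))
        region = rng.choice(list(GEO_MULT))
        tenure = rng.uniform(0.0, 12.0)
        activation = rng.uniform(0.2, 1.0)
        pay_gap = rng.uniform(-0.2, 0.1)
        performance = rng.uniform(0.0, 1.0)
        p_exit = exit_probability(activation, pay_gap, tenure, performance)
        people.append({
            "id": f"syn-{i}",
            "level": level,
            "function": function,
            "region": region,
            "tenure_years": tenure,
            "activation": activation,
            "pay_gap": pay_gap,
            "performance": performance,
            "pay": synthetic_pay(level, function, region, tenure, rng),
            "exited": rng.random() < p_exit,  # driven by the model, not assigned
        })
    return people


population = generate(1000, seed=42)
```

Because exits are drawn from a probability that depends on the drivers, any pattern an attrition analysis finds in this population traces back to a coefficient in the generator. That is what makes the output interpretable as a demonstration, and also why it says nothing about effect sizes in a real workforce.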

The honest framing matters as much as the construction. Because the relationships are designed, the analytics produce insights instead of static, which is the point of a demonstration. And because the data is synthetic, every figure on the command center is a property of a generated population, not an outcome from any real organization. None of these numbers is a client result, and none should ever be cited as one. The demonstration shows that the workflow recovers the signal that was designed into the data; it makes no claim about effect sizes in the world. Any quantitative claim about what these interventions return in a real organization would need its own sourced evidence, and the demonstration does not provide it.

Why most companies cannot do this

The principal-issues thesis has a sharp corollary: naming the load-bearing set is easy to say and hard to do, and the reason is that the workflow above has to hold together end-to-end before any of it is trustworthy. A company that has the segmentation but calculates in the wrong order gets confident nonsense. A company that has the value stack but no honest activation read gets a number on the wall it cannot defend. A company that has all of it but no discipline about where its numbers came from cannot tell the room which figures are load-bearing and which are decoration.

Most organizations have pieces. Very few have the whole pipeline — structure first, calculation strictly downstream, a value stack that ends in segmented opportunity, and the candor to say which numbers are real measurements and which are synthetic demonstration. That whole, running end-to-end, is what this platform is for. It is live now, and you can watch it run.

The compensation command center referenced throughout is a live demonstration surface running the workflow described here on structured synthetic data. The pipeline, the value stack, the Leadership Quality composite, and the from-to attrition analysis are all in production; the figures shown are properties of a generated population and are not client outcomes.