peopleanalyst



By Mike West

May 9, 2026

Before the Ratings

A compensation cycle in three movements — scenario modeling, Monte Carlo simulation, regression surrogate calculators — and why distribution and payout are not the independent cost drivers most models treat them as.


It is fall 2025. I am working with a large news media company, sitting in front of an annual-incentive question that has the wrong shape. The senior leaders need to set program parameters for the next compensation cycle. The financial year-end is months away. Performance ratings will not exist for another sixty days. They want a number.

The question behind the number is whether the company should fund the individual-performance component of the Annual Incentive Plan at the historical level (the level it has been funded at, and underspent at, for years) or reduce it. The reduction would free a single-digit-million pool that they want to redirect into the Restricted Stock Unit program to recognize exceptional performers more visibly. The financial calculus is, on its surface, simple. Multiply headcount by payout rate. Compare to budget. Adjust.

I have done compensation modeling for fifteen years and I know that this calculus is wrong. It is wrong in a particular way that is hard to explain in a meeting because the wrong way looks correct from outside. The cost of paying out a performance program is not the headcount times the rate. It is jointly determined by — and this is where most analytical work in this domain goes off the rails — at least four interacting variables, none of which is fully known, and at least one of which is normally not even measured.

The job, before any modeling can usefully begin, is to surface that to the people in the room.


The arithmetic that isn't arithmetic

Compensation pools under this kind of program design behave the way they do because they are fixed-budget, variable-rate systems. The total dollars available are set by a funding decision; the per-rating payout percentages are partly set and partly variable; the population is changing; and the performance distribution underneath the population is uncertain. The interaction of those four conditions produces a cost surface that does not respond linearly to any one input.

Four variables drive the outcome:

Performance distribution. How many employees fall into each rating category: Exceptional, Successful, Mixed, Unsatisfactory, Too Soon to Rate. More Exceptional ratings in a unit consume more of that unit's budget; fewer leave headroom. The shape of each unit's distribution determines what the variable component costs.

Compensation distribution within ratings. This is the variable that almost nothing in the analytical-decks-and-dashboards genre tracks well. Two units with identical rating distributions can produce materially different budget impacts if the high-rated employees in one unit are concentrated at the top of the salary band while the other unit's high-rated employees sit at the median. That is not an edge case. It is the dominant source of inter-unit variance in this kind of program. It is the most under-appreciated factor in compensation modeling.

Organizational composition. Who is actually in each unit, what they earn, what their target bonus percentage is — and the fact that this changes weekly through the planning cycle as transfers, terminations, and new hires occur. The model that runs on Monday's data is not running on Friday's data, and Friday is the data the executives will see.

Budget allocation. Per-unit budget allocations that are themselves subject to revision as the cycle progresses, often as late as the week before sign-off.

The interaction of these four means that small changes in any one dimension produce outsized or counter-intuitive changes in the derived per-unit Exceptional payout. This is not a spreadsheet limitation. It is a structural feature of fixed-budget, variable-rate compensation programs, and the field's habit of treating the four variables as independent — distribution drives this much, comp mix drives that much, add the contributions, you have your answer — produces sensitivity numbers that are predictably wrong.
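The effect of compensation distribution within ratings can be seen in a small worked example. The sketch below is illustrative only: the `Employee` type, the 100% Successful rate, the dollar figures, and the budget rule are all assumptions made for the example, not the production model. Two units share the same rating mix, the same total target dollars, and the same budget, yet the Exceptional payout the budget supports differs because the Exceptional employees sit at different points in the salary band.

```python
# Hypothetical illustration: every name and number here is invented for the sketch.
from dataclasses import dataclass

@dataclass
class Employee:
    rating: str          # "Exceptional" or "Successful" in this toy example
    target_bonus: float  # salary x target bonus %, in dollars

# Assumed fixed payout for Successful, as a multiple of target bonus.
FIXED_RATES = {"Successful": 1.00}

def derived_exceptional_rate(unit, budget):
    """Exceptional payout multiple the budget supports after fixed-rate costs."""
    fixed_cost = sum(e.target_bonus * FIXED_RATES[e.rating]
                     for e in unit if e.rating in FIXED_RATES)
    exceptional_target = sum(e.target_bonus for e in unit
                             if e.rating == "Exceptional")
    return (budget - fixed_cost) / exceptional_target

# Both units: 2 Exceptional, 8 Successful, 100,000 in total target dollars.
unit_top_heavy = ([Employee("Exceptional", 20_000)] * 2
                  + [Employee("Successful", 7_500)] * 8)
unit_median = ([Employee("Exceptional", 10_000)] * 2
               + [Employee("Successful", 10_000)] * 8)

budget = 110_000  # same budget for both: 110% of total target dollars

rate_top_heavy = derived_exceptional_rate(unit_top_heavy, budget)  # 1.25
rate_median = derived_exceptional_rate(unit_median, budget)        # 1.50
```

With identical rating distributions, the top-heavy unit supports a 125% Exceptional payout and the median unit supports 150%. A model that looks only at rating shares cannot see that difference.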

The job is to build something that surfaces the joint behavior, not the additive approximation.


Movement one: scenario modeling

The first movement is deterministic. Two design shapes are placed side-by-side, and the executive conversation becomes a comparison instead of a guess.

Scenario A — fixed accrual, variable Exceptional payout. Set the accrual at the lower funding level. Set Successful, Mixed, and Unsatisfactory at fixed percentages. Cap the Exceptional payout. Then let the per-unit Exceptional payout fall out of the math — whatever the budget, after the fixed-rate costs, supports for that unit's Exceptional population.

Scenario B — variable accrual, fixed Exceptional payout. Set Exceptional at a single fixed percentage company-wide. Set the other rates as in A. Then ask: what funding level is required to support this?

The two scenarios are not interchangeable. They formalize a real choice the executives have to make — do you accept that some units will pay out at higher Exceptional rates than others, or do you commit to a flat company-wide rate and absorb the cost variation through funding instead? — and they make the trade-offs visible. Neither is "right." The contrast is the deliverable.
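The two shapes can be sketched as a pair of functions over one unit's target dollars by rating. This is a minimal sketch under stated assumptions: the fixed rates in `RATES`, the cap, the example unit, and the function names are hypothetical, and Too Soon to Rate is left out for brevity.

```python
# Hypothetical fixed payout rates, as multiples of target bonus.
RATES = {"Successful": 1.00, "Mixed": 0.50, "Unsatisfactory": 0.0}

def scenario_a(target_by_rating, accrual_rate, cap):
    """Fixed accrual, variable Exceptional.

    Returns the Exceptional payout multiple the budget supports after
    fixed-rate costs, limited to the range [0, cap].
    """
    budget = accrual_rate * sum(target_by_rating.values())
    fixed_cost = sum(RATES[r] * t for r, t in target_by_rating.items()
                     if r in RATES)
    exceptional_target = target_by_rating.get("Exceptional", 0.0)
    if exceptional_target == 0:
        return None  # no Exceptional population, nothing to derive
    return max(0.0, min(cap, (budget - fixed_cost) / exceptional_target))

def scenario_b(target_by_rating, exceptional_rate):
    """Variable accrual, fixed Exceptional.

    Returns the funding level required, as a share of total target dollars.
    """
    rates = {**RATES, "Exceptional": exceptional_rate}
    cost = sum(rates[r] * t for r, t in target_by_rating.items())
    return cost / sum(target_by_rating.values())

# Example unit: target bonus dollars by rating (invented figures).
unit = {"Exceptional": 200_000, "Successful": 700_000,
        "Mixed": 80_000, "Unsatisfactory": 20_000}

derived_rate = scenario_a(unit, accrual_rate=0.95, cap=2.0)
required_funding = scenario_b(unit, exceptional_rate=derived_rate)
```

When the cap does not bind, feeding Scenario A's derived rate into Scenario B returns the accrual rate that A started from. The two scenarios are the same cost identity solved for different unknowns, which is why comparing them is informative.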

The scenario tabs are formatted for executive presentation. Per-unit views inside per-SVP-organization views inside the company-wide view. The same model parameters; three levels of aggregation; one consistent set of derivations. This is the part of the work that most resembles what the field thinks compensation modeling is. It is the smallest part of the work.

Movement two: Monte Carlo light

Scenario modeling answers parameter questions if you accept the assumed performance distribution. The honest answer to "what's the Exceptional payout going to be?" is that it depends on a distribution that does not yet exist, and any single set of assumptions about that distribution is speculative.

Monte Carlo is the response. A partially randomized algorithm applies simulated performance ratings to the actual employee population, respecting organizational structure and plausible distribution constraints. The model recalculates all payout outcomes for the simulated assignment. An Apps Script macro automates the loop — toggle a recalculation trigger, capture the output range, append the trial's results to a cumulative trials sheet, repeat in batches of twenty-five. Fifty or more trials accumulate per analysis run.

This is intentionally Monte Carlo light. The simulation runs in Google Sheets because keeping the analysis in the same surface as the scenario tabs preserves auditability. Stakeholders can inspect the assumptions directly. Nothing depends on a model nobody in the meeting can examine.

The output is what changes the conversation. Instead of "if the distribution is X, the per-unit payout is Y," the output is "across fifty plausible distributions, the per-unit payout is between Y₁ and Y₂, with median Y₃." The conversation moves from "what is the answer" to "the answer exists on a range — here is the likely outcome, here is how far it could deviate, here are the units where the spread is widest." That is the right shape of conversation for a parameter-setting decision being made before the data exists.
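The loop described above ran as an Apps Script macro inside Google Sheets. The sketch below restates the same idea in Python purely for illustration, with several simplifying assumptions: ratings are drawn independently from an assumed mix (`WEIGHTS`), organizational and distribution constraints are ignored, and the fixed rates, cap, and population are invented.

```python
import random
import statistics

RATINGS = ["Exceptional", "Successful", "Mixed", "Unsatisfactory"]
WEIGHTS = [0.20, 0.65, 0.10, 0.05]  # assumed plausible rating mix
FIXED = {"Successful": 1.00, "Mixed": 0.50, "Unsatisfactory": 0.0}
CAP = 2.0  # assumed cap on the Exceptional payout multiple

def one_trial(targets, budget, rng):
    """Assign simulated ratings, then derive the Exceptional payout multiple."""
    ratings = rng.choices(RATINGS, weights=WEIGHTS, k=len(targets))
    fixed_cost = sum(FIXED[r] * t for r, t in zip(ratings, targets)
                     if r in FIXED)
    exc_target = sum(t for r, t in zip(ratings, targets)
                     if r == "Exceptional")
    if exc_target == 0:
        return None  # no Exceptional employees in this trial
    return max(0.0, min(CAP, (budget - fixed_cost) / exc_target))

def simulate(targets, budget, trials=50, seed=0):
    """Run the trials and summarize the payout as (low, median, high)."""
    rng = random.Random(seed)
    results = [one_trial(targets, budget, rng) for _ in range(trials)]
    results = [r for r in results if r is not None]
    return min(results), statistics.median(results), max(results)

# Hypothetical unit of 40 employees with rising target bonus dollars.
targets = [5_000 + 250 * i for i in range(40)]
low, median, high = simulate(targets, budget=sum(targets))
```

The three returned values correspond to Y₁, Y₃, and Y₂ in the paragraph above. Running the same function per unit and comparing `high - low` across units gives a rough view of which units are widely dispersed.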

The simulation surfaces something the deterministic model could not. Some units' outcomes are tightly clustered — their Exceptional payout will land in a narrow band almost regardless of how the ratings come in. Other units' outcomes are widely dispersed — their Exceptional payout could land anywhere across a forty-percentage-point range depending on how the ratings sort. The dispersion is itself information. Wide-dispersion units are units whose ratings the model cannot predict from structural features alone, and the executives need to know that those units' payouts are inherently more uncertain. Narrow-dispersion units are structurally constrained — whatever the ratings, those units will pay out near a specific point — and decisions about those units can be made with higher confidence.

Movement three: regression surrogate calculators

Running the Monte Carlo is slow. A batch of twenty-five trials takes several minutes; answering a new what-if question typically requires re-running the simulation. Executives in a meeting do not have several minutes per scenario. They have moments, and the moments compound.

The regression surrogate calculator is the answer. A regression is fitted to the accumulated trial data, producing a closed-form equation that approximates the simulation's behavior. A user changes the segment, the performance distribution, or the budget parameters and immediately sees the predicted outcome — without touching the simulation workbooks.

Two calculators are built:

Forward. Given a desired Exceptional payout percentage and an assumed distribution, predicts the total program cost. Includes a sensitivity table stepping payout from the lower bound to the upper bound in one-percentage-point increments, with fundability flags showing exactly where each segment crosses its target budget.

Inverse. Given a desired total cost and the same assumed distribution, solves for the Exceptional payout that produces that cost. The inverse model exploits the structure of the cost equation, which decomposes into a baseline term (the part of the cost that does not depend on the Exceptional payout) and a slope term (the Exceptional-dependent part). With baseline and slope known, the cost equation solves linearly for the payout: payout = (desired cost − baseline) / slope. An approximate 95% confidence band is propagated through by replacing the desired cost with cost minus 1.96·RMSE and cost plus 1.96·RMSE and solving at each bound.
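Both calculators reduce to a few lines once the baseline and slope have been evaluated for the assumed distribution. The sketch below takes those two numbers and the RMSE as given inputs; the values used are invented, and the function names are hypothetical rather than taken from the production workbook.

```python
Z_95 = 1.96  # normal quantile used for the approximate 95% band

def forward_cost(baseline, slope, payout):
    """Predicted total cost at a given Exceptional payout multiple."""
    return baseline + slope * payout

def sensitivity_table(baseline, slope, lo_pct, hi_pct, budget):
    """Rows of (payout %, predicted cost, fundable?) in one-point steps."""
    rows = []
    for pct in range(lo_pct, hi_pct + 1):
        cost = forward_cost(baseline, slope, pct / 100)
        rows.append((pct, cost, cost <= budget))
    return rows

def inverse_payout(baseline, slope, desired_cost, rmse):
    """Solve cost = baseline + slope * payout for payout.

    Returns (low, point, high), where low and high solve the same equation
    at desired_cost -/+ 1.96 * RMSE.
    """
    def solve(cost):
        return (cost - baseline) / slope
    margin = Z_95 * rmse
    return (solve(desired_cost - margin),
            solve(desired_cost),
            solve(desired_cost + margin))

# Invented example values for one segment.
baseline, slope, rmse = 740_000.0, 200_000.0, 5_000.0

rows = sensitivity_table(baseline, slope, 120, 140, budget=1_001_000)
last_fundable = max(pct for pct, cost, ok in rows if ok)

low, point, high = inverse_payout(baseline, slope, 1_000_000, rmse)
```

In this example the segment stops being fundable above a 130% payout, and a desired cost of 1,000,000 maps to a payout of about 130% with a band of roughly 125% to 135%. The band width is governed by RMSE divided by slope, so segments with a small Exceptional population (a small slope) get wider bands.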

The calculators run instantly. In a meeting, an executive can ask: "What if Exceptional is 140% in the editorial division? What does that cost?" and the forward calculator answers before the next sentence. "We have a small additional budget for that division. What payout does it support?" and the inverse calculator returns the answer with a confidence band. The Monte Carlo machinery is still load-bearing, because the trial data is what the regression is fitted to, but it lives behind the calculator. The conversation moves fast.

The non-obvious analytical move in the regression model is the interaction terms. The cost equation includes not just the main effects of distribution share and payout level but the products of share and payout — (share of Mixed) × payout, (share of Successful) × payout, (share of Exceptional) × payout. Without these, the model would assume that shifting one percent of employees from Successful to Exceptional has the same cost impact regardless of whether Exceptional is paid at 120% or 180%. That is wrong. The interaction terms capture the reality that the cost sensitivity to distribution shifts depends on the payout level, which is exactly the non-linearity that makes the problem hard to reason about intuitively.
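One way to write the cost equation described above is shown below. The notation is an assumption introduced for illustration: $s_r$ is the share of employees with rating $r$, $p$ is the Exceptional payout, and the $\beta$ and $\gamma$ coefficients are the fitted main effects and interaction effects. The exact set of terms in the fitted model may differ.

```latex
\begin{align*}
C &= \beta_0 + \sum_{r} \beta_r\, s_r + \beta_p\, p
     + \sum_{r} \gamma_r\, (s_r \cdot p) + \varepsilon,
  \qquad r \in \{\text{Mixed}, \text{Successful}, \text{Exceptional}\} \\[4pt]
  &= \underbrace{\Big(\beta_0 + \sum_{r} \beta_r\, s_r\Big)}_{\text{baseline}}
   + \underbrace{\Big(\beta_p + \sum_{r} \gamma_r\, s_r\Big)}_{\text{slope}}\, p
   + \varepsilon \\[4pt]
\Delta C &\approx 0.01 \big[ (\beta_E - \beta_S) + (\gamma_E - \gamma_S)\, p \big]
  \quad \text{for a one-point shift from Successful to Exceptional}
\end{align*}
```

The last line is the point of the interaction terms. Without the $\gamma$ coefficients, the cost of the shift would be the constant $0.01(\beta_E - \beta_S)$, identical at $p = 1.2$ and $p = 1.8$. With them, the cost of the shift grows with $p$. The second line also shows why the cost is linear in $p$ once the distribution is fixed, which is the baseline-plus-slope structure the inverse calculator relies on.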

Most compensation models I have inherited from other analysts and consultancies treat distribution and payout as independent cost drivers. They produce sensitivity numbers for one variable at a time, additively. They are predictably wrong, in a direction that systematically underestimates the cost of high-payout decisions in high-Exceptional-share units. The interaction terms are the principal issue this whole modeling stack exists to surface.

The regression fit varies meaningfully by segment. Some divisions show very high R² — the relationship between payout parameters and cost is highly predictable there, because distributional structure is stable and salary distribution is tight. Other divisions show much lower R² — factors outside the model, especially the comp-mix-within-ratings variable from the four-variable list, introduce variance that a linear model of aggregate distributions cannot capture. The low-R² result is itself a finding. It tells the executives: "For this division, you should apply wider uncertainty margins than for the others, because the structural variance is genuinely larger." That is information the field's usual deck-shaped output does not surface.


What nobody scoped

There is a problem upstream of all of this that nobody put on the project plan, and that nearly killed the work midway through.

The organizational groupings the executives use — "everyone who reports to the CTO," "the editorial division minus the union members," "the cross-functional engineering organization across product, design, and engineering" — do not exist as recorded fields in the HRIS. The HRIS records reporting hierarchy and cost center and job code. The analytical groupings the business cares about are constructions: compound criteria that say take everyone reporting under this executive, except the ones in these cost centers, except the union members, plus these specific employees who are organizationally elsewhere but functionally part of this unit, and place them in this analytical bucket.

Prior-year mapping files exist. They are not reliable as rules. Organizations change names, restructure, absorb sub-teams, lose them. Apply the prior year's mapping logic to this year's HRIS export and dozens of employees fall out of joins. Others get incorrectly assigned. Many of the cost centers no longer exist or have been consolidated or split. The prior year's mapping is a pattern you can use to infer the intended logic; it is not a lookup table you can apply.

Each mapping is iterated through multiple conversations with the client. Sometimes at the level of individual employees. "This person reports under that executive, but functionally she's part of this other unit." The correct logic emerges through the conversations, not from a single specification.

This problem is upstream of everything else in the model. If the mapping is wrong, every scenario, every simulation, and every budget comparison built on top of it is wrong. When the executives ask "what does the editorial division cost," the entire chain of analytical reasoning depends on what editorial division means in the model's mapping table, and what it means is determined by half a dozen compound criteria nobody has ever written down.

In the production model, the mapping is encoded in formulas across multiple sheets: a hierarchy lookup, a department override, a manager override, an individual employee override. The cascade resolves in priority order — if the hierarchy assigns someone correctly, you use that; if it doesn't, you fall through to the department override; if that doesn't catch them, the manager override; finally, an individual employee override for the cases nothing else handles. By the end of the cycle there are several thousand individual overrides — one per employee whose mapping cannot be inferred from any rule.
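The production cascade lived in spreadsheet formulas. As a rough illustration of the same priority order, the sketch below resolves one employee through four lookup tables and reports which rule placed them. The field names (`reports_under`, `cost_center`, `manager_id`, `employee_id`) and the table contents are hypothetical, and the sketch assumes a rule "does not catch" an employee when its table has no entry for them.

```python
def resolve_unit(emp, hierarchy, dept_overrides, manager_overrides,
                 employee_overrides):
    """Return (unit, rule_name) from the first rule that places the employee.

    Rules are tried in the priority order described in the text:
    hierarchy, department override, manager override, employee override.
    """
    rules = [
        ("hierarchy", hierarchy.get(emp["reports_under"])),
        ("department", dept_overrides.get(emp["cost_center"])),
        ("manager", manager_overrides.get(emp["manager_id"])),
        ("employee", employee_overrides.get(emp["employee_id"])),
    ]
    for rule_name, unit in rules:
        if unit is not None:
            return unit, rule_name
    return None, "unmapped"  # fell out of every rule; needs a conversation

# Hypothetical mapping tables.
hierarchy = {"exec_cto": "Engineering"}
dept_overrides = {"CC-410": "Editorial"}
manager_overrides = {"M-77": "Product"}
employee_overrides = {"E-9": "Editorial"}

def make_emp(employee_id, reports_under, cost_center, manager_id):
    return {"employee_id": employee_id, "reports_under": reports_under,
            "cost_center": cost_center, "manager_id": manager_id}

placed_by_hierarchy = resolve_unit(
    make_emp("E-1", "exec_cto", "CC-100", "M-01"),
    hierarchy, dept_overrides, manager_overrides, employee_overrides)
placed_by_employee = resolve_unit(
    make_emp("E-9", "exec_other", "CC-999", "M-02"),
    hierarchy, dept_overrides, manager_overrides, employee_overrides)
unplaced = resolve_unit(
    make_emp("E-5", "exec_other", "CC-999", "M-02"),
    hierarchy, dept_overrides, manager_overrides, employee_overrides)
```

Returning the rule name alongside the unit makes it possible to count how many employees each layer handles, and to list the unmapped ones for the next client conversation.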

The product-design lesson, which I am writing in this magazine and would put in any handoff to a successor, is this: a future application of this work should treat the organizational mapping layer as a first-class, client-editable feature. It should be a dynamic mapping table the business can modify on the fly, because the mapping directly affects every downstream outcome, because the mappings will change next cycle and the cycle after that, and because the analyst who inherits the work will not know which mappings are still right and which need to be rebuilt. The HRIS records data the way it does for operational reasons; the analytical groupings are how the business thinks about itself, and the gap between those two is where the work lives.


The decision

Two scenarios converge on the same funding answer. Scenario A — fixed accrual, variable Exceptional — produces sensible per-unit Exceptional payouts at the lower funding level, with the cap binding only in the most concentrated-Exceptional units. Scenario B — variable accrual, fixed Exceptional — also produces the lower funding level as the required level. Both shapes of the question, asked independently, return the same answer. That convergence is the validation.

The decision: fund at the lower level. Reallocate the freed budget into the RSU program to fund a company-wide Exceptional multiplier on top of target awards, with the divisional flexibility that lets specific units pay above the multiplier when their structural fundability allows it. Per-unit Exceptional AIP payouts vary by department within the cap — preserving the unit-level variation rather than papering it over with a flat company-wide rate.

The decision shape preserves what the model surfaced. The conversation in the room was "different units have structurally different fundability, and a flat company-wide payout requires either cross-unit redistribution or accepting that some units will pay below their structural rate while others pay above it." The chosen design accepts the variation, names it explicitly, caps the upper bound, and uses the freed budget, redirected into the RSU program, to deliver the company-wide-recognition signal through a separate vehicle.

This is the kind of decision that happens because the analytical work surfaced a structural feature the field's usual analysis would have hidden. The dominant practice in this domain is to set a flat company-wide payout because it feels equitable and it is administratively simple. The principal issue with that practice is that it does not match the cost structure underneath the program, and it generates either persistent under-funding in some units (with executives quietly complaining that "we never have enough budget for our high performers") or persistent over-funding in others (which the audit later flags). Naming the variation and managing it explicitly is what the analytical work earns.


What happened next

The modeling produced the parameters. Translating them into actual payments was a different workstream with a different risk profile, and that workstream is where most of what could go wrong actually goes wrong.

Three teams coordinate the execution. The People Analytics team — me — owns the modeling and the file generation that drives the compensation system. Corporate Compensation owns the sanity checks and the executive sign-off chain. The Shared Services Center compensation team and HRIS own the system configuration, the sandbox and production environments, and the integration loads. The seven-phase workflow runs from December through late January, with the most critical work compressed into a two-week window in mid-January during which models have to be re-run with production data, system-load files have to be created and uploaded and validated, and any upstream delay cascades through the chain.

Several truths about this phase are not optional. They are structural.

The data sources proliferate. Multiple Workday reports with similar names exist. The reports used for modeling are not always the same as the reports driving the compensation tool. The performance ratings field is dynamic — the same field name returns different underlying data depending on the date you pull it, because the source-of-truth resides in different parts of the system at different times in the cycle. Confirming which report is authoritative for each data element requires explicit coordination at the start of each cycle, in writing, with the team that controls the system.

Late-changing data is the norm, not the exception. Performance ratings continue to be finalized and corrected after the snapshot date. Some changes occur on the last day before data has to be submitted for payroll processing. Organizational changes — transfers, terminations, new hires — continue altering unit composition and budget allocation throughout January. The model that produced the executive deck on January 9 is not the model that runs on January 19, and the model that runs on January 19 is not the model that produced the final compensation statements at the end of February.

Executive approval is a bottleneck, not a formality. The original target date for executive sign-off is mid-January. In practice, the sign-off is delayed by an emergency that pulls the relevant decision-maker away. The contingency plan (four escalating scenarios depending on when the approval lands) triggers, and the configuration team spends the weekend compressing what was supposed to be a multi-day testing window into thirty-six hours. Monday's launch is preserved by weekend work, but the contingency cost is real.

Post-submission ratings changes cause statement mismatches. A small number of employees (fewer than ten) have ratings changed after the formal submission deadline. The changes are submitted by the performance team but not propagated to the compensation tool, and the compensation statements that arrive on those employees' desks reflect the old rating instead of the new one. The fix is manual correction. The retrospective notes that "ratings submissions need to be shut down sooner." That is a process gap, not an analytical one, but the analytical work will own its visibility, because the analyst is the one who can detect the mismatch by cross-referencing two reports.
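The cross-reference itself is simple once both reports are reduced to a mapping from employee to rating. The sketch below is an illustration under that assumption; the report shapes, employee IDs, and function name are hypothetical, and real exports would first need to be parsed and keyed consistently.

```python
def rating_mismatches(performance_report, compensation_report):
    """Employees whose rating differs between the two reports.

    Each report is a dict of employee_id -> rating. Employees missing from
    one report are included, with None on the side where they are absent.
    """
    mismatches = {}
    all_ids = set(performance_report) | set(compensation_report)
    for emp_id in sorted(all_ids):
        perf = performance_report.get(emp_id)
        comp = compensation_report.get(emp_id)
        if perf != comp:
            mismatches[emp_id] = (perf, comp)
    return mismatches

# Invented example: E-2 was changed after the deadline, E-4 is missing
# from the compensation tool's report.
performance_report = {"E-1": "Successful", "E-2": "Exceptional",
                      "E-3": "Mixed", "E-4": "Successful"}
compensation_report = {"E-1": "Successful", "E-2": "Successful",
                       "E-3": "Mixed"}

found = rating_mismatches(performance_report, compensation_report)
```

Run on each pull of the two reports, a check like this turns the mismatch from something discovered on an employee's statement into something caught before statements are generated.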

The case generalizes. Modeling and execution are different workstreams with different risk profiles, and most of the failure happens at the seam between them. A model that produces beautiful scenario outputs but does not survive the seam is a model that did not ship. A future application of this work (and I would commend this to anyone designing a successor) should be designed for the seam as much as for the modeling. Generation of Workday EIB (Enterprise Interface Builder) load files built into the tool. Sanity-check validation built into the workflow rather than performed manually after the fact. Multi-environment awareness from the start. Late-changing data accommodated without requiring a full rebuild of the model.


What this case generalizes to

Two principles, one structural and one operational.

Distribution and payout are not independent cost drivers. They multiply. Any compensation framework that treats them as additive — and most do — produces predictably wrong sensitivity numbers, in a direction that systematically underestimates the cost of high-payout decisions in high-Exceptional-share units. The interaction terms are the structural non-linearity. Surfacing them explicitly, in a model that is auditable by the people in the meeting, is what turns the analysis from a calculation into a decision tool.

The modeling and the execution are different workstreams. Most of what calls itself compensation modeling treats the modeling as the work and the execution as a downstream administrative step. That framing is wrong. The execution phase is where data sources proliferate, where late-changing inputs cascade, where executive approval delays compress configuration windows, where post-submission corrections create statement mismatches. The analytical work that survives the seam is the work that planned for the seam from the start.

The principal issue most compensation work misses is not analytical sophistication. It is structural fit. The field is full of good arithmetic in the wrong shape — additive sensitivities where the cost surface is multiplicative; deck-shaped outputs where the conversation needs ranges; modeling pipelines that stop at the executive deck and hand off to administration with no plan for the seam. The work that ships is the work that names what the obvious framing misses, surfaces it in a tool the people in the room can interrogate, and survives the operational chain from sign-off to payroll.

In one cycle at one company, this is what the work looked like. In the next cycle, the parameters will differ; the population will have shifted; the unions will have renegotiated their contracts; the international locations will have re-grouped; the executive team will have new people. The analytical framework — scenario modeling on the fixed-vs-variable trade-off, simulation-based range estimation, regression-based surrogate calculators with explicit interaction terms, organizational mapping as a first-class editable feature, execution-aware design — generalizes. The specific numbers will not.

That is the case I would commend to anyone doing this kind of work: do not deliver a number; deliver the structure underneath the number, and the tool the executives can use to interrogate the structure when the next cycle's number is the one that matters.
