The PeopleAnalyst Guide to Work Rules · Ch 03

Lake Wobegon (Hire Only People Better Than You)

What Bock argues

The claim is to invest more in getting people than in fixing them: spend the money at the front door, hire above your current median so the average keeps rising, and hire slowly enough to hold the bar. "Hire only people better than you" is the slogan; the substance is that a great hire compounds for years while a mediocre one costs for years, so selection is the highest-leverage dollar in talent — higher than training, higher than performance management. Bock's posture is that most companies have this exactly backwards: they rush hiring and then pour money into developing people they shouldn't have hired.

The validity case for how to select belongs to Chapter 5. This chapter covers the part Bock asserts and the field quantified: why selection is worth obsessing over. The answer is a dollar figure, not a vibe.

What the research actually says (and where 2015 needs an update)

Two bodies of work carry this chapter. The first is the validity hierarchy from Schmidt and Hunter (1998), with work samples, structured interviews, and general mental ability as the strongest single predictors. Chapter 5 handles it, and the Sackett et al. (2022) reanalysis sharpened it (the range-restriction corrections were overstated; structure is doing even more of the work). The second, and the one this chapter is really about, is utility analysis (Boudreau & Ramstad; the Schmidt-Hunter utility tradition), whose core insight is that the dollar value of better selection scales with three things: how valid your method is, how many people you hire, and how much performance varies between people in the role. That last term is the quiet bombshell. In roles where the gap between a good and a great performer is large (and Chapter 8 says that gap is power-law-large), even a small improvement in selection validity, multiplied across many hires held for several years, is worth a startling amount of money. "Hire above the median" stops being a platitude and becomes an investment with a computable return.
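To make "computable return" concrete, here is a minimal sketch of the utility arithmetic. It assumes the Brogden-Cronbach-Gleser form that the utility tradition commonly uses (gain = hires × tenure × validity × dollar value of one SD of performance × mean standardized score of those hired, minus assessment cost). The chapter does not name that formula, so treat the choice, the field names, and every number below as illustrative assumptions rather than the guide's own method. The linear SD term also assumes roughly normal performance, which understates the gap if Chapter 8's power-law picture holds.

```python
from dataclasses import dataclass


@dataclass
class UtilityInputs:
    """Hypothetical container for the inputs; all names are illustrative."""
    n_hired: int               # people hired with the method
    tenure_years: float        # average years a hire stays in the role
    validity: float            # correlation between selection score and performance
    sd_y_dollars: float        # dollar value of one SD of performance, per year
    mean_z_selected: float     # average standardized selection score of those hired
    n_applicants: int          # people assessed
    cost_per_applicant: float  # assessment cost per person assessed


def selection_utility(u: UtilityInputs) -> float:
    """Estimated dollar gain over random selection, net of assessment cost."""
    gain = (u.n_hired * u.tenure_years * u.validity
            * u.sd_y_dollars * u.mean_z_selected)
    cost = u.n_applicants * u.cost_per_applicant
    return gain - cost


# Made-up scenario: 100 hires from 500 applicants, 3-year tenure.
# A mean z of 1.4 is roughly what strict top-down hiring of the top 20% gives.
base = UtilityInputs(100, 3.0, 0.30, 20_000, 1.4, 500, 200)
better = UtilityInputs(100, 3.0, 0.40, 20_000, 1.4, 500, 200)

delta = selection_utility(better) - selection_utility(base)
print(f"baseline method: ${selection_utility(base):,.0f}")
print(f"improved method: ${selection_utility(better):,.0f}")
print(f"value of +0.10 validity: ${delta:,.0f}")
```

The point of the sketch is the multiplication: a modest validity gain is small on its own and large once it is multiplied by volume, tenure, and performance variation.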

This is also where "hire slowly" earns its keep. Holding a high bar means rejecting people who would have been fine — accepting more false negatives to avoid false positives — and that trade is only rational if a bad hire is genuinely expensive, which utility analysis says it is. The slowness is not fussiness; it's the cost of keeping the selection ratio low enough that validity actually pays off.
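The selection-ratio point can be sketched numerically. Assuming selection scores are normally distributed and you hire strictly top-down (both simplifying assumptions, not claims from the chapter), the mean standardized score of the people you hire rises as you hire a smaller fraction of applicants. That mean score is the multiplier validity gets to work with.

```python
from statistics import NormalDist


def mean_z_of_selected(selection_ratio: float) -> float:
    """Mean standardized score of hires under normal scores and top-down hiring."""
    if not 0.0 < selection_ratio < 1.0:
        raise ValueError("selection_ratio must be strictly between 0 and 1")
    std_normal = NormalDist()
    # Score cutoff above which the top `selection_ratio` share of applicants sits.
    cutoff = std_normal.inv_cdf(1.0 - selection_ratio)
    # Mean of a standard normal truncated below at the cutoff.
    return std_normal.pdf(cutoff) / selection_ratio


for ratio in (0.50, 0.20, 0.05):
    print(f"hire top {ratio:.0%}: mean z of hires = {mean_z_of_selected(ratio):.2f}")
```

A stricter bar raises the multiplier, but it also means assessing more applicants per hire, which is the cost side of the false-negative trade.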

Two honest caveats. First, utility-analysis dollar figures are notoriously sensitive to their assumptions (especially the performance-variation term, which is hard to estimate) — use them to compare options and size the prize, not to promise a board an exact number. Second, "hire people better than you" assumes you can measure "better" reliably — and Chapter 5 and the reliability program say a single unstructured interviewer can't. Selection utility is only real if the selection signal is reliable; a high-validity method run by a noisy rater throws the utility away.
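Both caveats can be shown in one small sketch. It assumes the classical attenuation relationship (observed validity = true validity × square root of the predictor's reliability) and the same multiplicative gain used in utility analysis. The scenario values, the three performance-variation guesses, and the function names are hypothetical and exist only to show how wide the range gets.

```python
import math


def attenuated_validity(true_validity: float, rater_reliability: float) -> float:
    """Validity you actually observe when the selection score is measured with noise."""
    if not 0.0 <= rater_reliability <= 1.0:
        raise ValueError("rater_reliability must be between 0 and 1")
    return true_validity * math.sqrt(rater_reliability)


def utility_gain(n_hired: int, tenure_years: float, validity: float,
                 sd_y_dollars: float, mean_z_selected: float) -> float:
    """Gross dollar gain over random selection, before assessment costs."""
    return n_hired * tenure_years * validity * sd_y_dollars * mean_z_selected


# Hypothetical scenario: 50 hires, 3-year tenure, mean z of hires = 1.0.
N_HIRED, TENURE_YEARS, MEAN_Z = 50, 3.0, 1.0
TRUE_VALIDITY = 0.45  # assumed validity of the method with a perfectly reliable rater

for sd_label, sd_y in (("low", 10_000), ("mid", 20_000), ("high", 40_000)):
    for reliability in (0.9, 0.5):
        validity = attenuated_validity(TRUE_VALIDITY, reliability)
        gain = utility_gain(N_HIRED, TENURE_YEARS, validity, sd_y, MEAN_Z)
        print(f"SD_y {sd_label:>4} (${sd_y:,}), reliability {reliability:.1f}: "
              f"validity {validity:.2f}, gain ${gain:,.0f}")
```

Reading the output as a range rather than a point estimate is the intended use: the performance-variation guess moves the answer by multiples, and a noisy rater shrinks every row.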

Where 2015 needs the update: AI screening promises to raise the two terms that scale, validity and volume, at once by applying a consistent method across far more applicants. If the method is valid and reliable, that's exactly the utility lever Bock wants, made cheap. If it's a single biased model (Chapter 4) applied consistently, you've scaled a negative-utility selector, confidently and at volume. The economics cut both ways, and the sign depends on the validity and reliability you bothered to measure.

How you run it

The analysis you can execute

A hiring-quality + selection-utility analysis: calculus for the utility computation and the performance-variation estimate, plus a quality-of-hire instrument (one of the four genuinely net-new builds the chapter map flags). Pair it with the reliability discipline from Ch 5 so the "better" you're selecting on is measured reliably, not off a single rater. Gate any group breakdown on a minimum N, so small groups are suppressed rather than reported as noisy comparisons.
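A minimum-N gate on group breakdowns can be as simple as the sketch below. The threshold of 30, the record shape, and the function name are assumptions for illustration; the chapter does not specify them.

```python
from collections import defaultdict
from statistics import mean

MIN_N = 30  # hypothetical threshold; pick one that fits your data and privacy rules


def gated_group_means(records, min_n=MIN_N):
    """Mean quality-of-hire score per group, or None where the group is too small.

    `records` is an iterable of (group_label, score) pairs.
    """
    scores_by_group = defaultdict(list)
    for group, score in records:
        scores_by_group[group].append(score)
    return {
        group: (mean(scores) if len(scores) >= min_n else None)
        for group, scores in scores_by_group.items()
    }


# Made-up data: one group clears the gate, one does not.
sample = [("engineering", 3.5)] * 40 + [("legal", 4.0)] * 5
print(gated_group_means(sample))
```

Returning `None` instead of dropping the group keeps the suppression visible in the report.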

The AI-era turn

AI selection is a utility lever or a liability multiplier, and which one is an empirical question you must answer before deploying, not after. Measure the tool's validity against your own quality-of-hire outcome and its reliability as a rater (Ch 5). A valid, reliable, consistent screener applied at volume is the best version of "hire above the median" ever available. A biased one applied at volume is Chapter 4's nightmare with a spreadsheet. Same tool, opposite sign — decided by measurement.
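One hedged way to frame that measurement: score the same candidates twice to estimate the screener's consistency, correlate its scores with your quality-of-hire outcome to estimate validity, and refuse deployment unless both clear a threshold. The data, the thresholds, and the function names below are all made up for illustration. A validity estimate computed only on people you actually hired will also tend to understate the true figure because of range restriction.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def ok_to_deploy(validity, reliability, min_validity=0.20, min_reliability=0.70):
    """Hypothetical gate: both measurements must clear their thresholds."""
    return validity >= min_validity and reliability >= min_reliability


# Made-up scores for the same eight hires.
screener_run_1 = [62, 75, 58, 90, 70, 81, 66, 77]
screener_run_2 = [60, 78, 55, 88, 73, 79, 69, 74]  # same candidates, scored again
quality_of_hire = [3.1, 3.8, 2.9, 4.5, 3.2, 4.0, 3.5, 3.6]

reliability = pearson(screener_run_1, screener_run_2)
validity = pearson(screener_run_1, quality_of_hire)
print(f"reliability {reliability:.2f}, validity {validity:.2f}, "
      f"deploy: {ok_to_deploy(validity, reliability)}")
```

Eight rows is far too few for a real decision; the sketch only shows the shape of the check, not a defensible sample size.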

What to do Monday

  1. Define a quality-of-hire signal you can actually compute, and start tracking it — you can't manage selection utility you don't measure.
  2. Size the prize once: a rough utility estimate (validity × volume × performance-variation) for improving one high-volume, high-variation role. Show the assumptions and a range.
  3. Treat "hire slowly" as a deliberate false-negative trade, justified by the bad-hire cost — and say so out loud, so the bar doesn't quietly erode under pressure.
  4. Before trusting an AI screener, measure its validity against your quality-of-hire outcome and its reliability as a rater. No measurement, no deployment.

Cross-refs: Ch 5 (selection validity — which methods predict); Ch 4 (the biased screener — negative utility at scale); Ch 8 (power-law performance variation — the term that makes selection utility large); content/magazine/the-reliability-problem.md (a high-validity method needs a reliable rater).