Research methods & rigor
The rigor layer most analytics tools hide. A study-designer and an analysis-wizard backed by a stateless compute service, plus a public, citable catalog of every study design, test, assumption check, remedy, and sampling method, each grounded in the methods canon. It checks assumptions, chooses the right test, runs it, and shows the remedy moment: the naive answer next to the assumption-respecting one.
Type: algorithm
Origin repo(s): people-analyst (the People Analytics Toolbox) — the research-methods spoke, built on the cross-cutting stats + research-methods libraries
Extraction readiness: live (stateless compute service); the design-recommendation engine is a documented follow-up, not yet shipped
Depends on: the de-identification layer (callers post de-identified numeric vectors), the cited methods catalog, and Principia for the evidence layer (effect-size priors, the method taxonomy, the remedy knowledge base) — never for the computation itself
Last reviewed: 2026-06-08
What it is
The rigor layer most analytics tools hide. It is the compute service and catalog behind two research wizards — a study-designer and an analysis-wizard — plus a public, browsable rigor catalog that flies the flag: every study design, statistical test, assumption check, remedy, and sampling method the toolbox knows how to run, each grounded in the cited methods canon rather than invented.
The toolbox owns the math directly: t-tests, Welch's correction, Mann-Whitney, normality and homoscedasticity checks, Cohen's d, and standardized mean differences and balance checks, each tested against reference values. It also owns the catalog: the tests, their assumptions, the remedies when an assumption fails, the sampling methods, and a set of design archetypes. Principia supplies the evidence (the priors, the taxonomy, the cited remedy library) but never the computation.
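To make the named statistics concrete, here is a minimal sketch in plain Python of the pooled t statistic, Welch's correction, and Cohen's d, checked against hand-computed reference values. The function names are hypothetical and this is not the toolbox's own code. P-values are left out because they need a t-distribution CDF, which the Python standard library does not provide.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Student's t statistic with a pooled variance. Returns (t, df)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # squared standard errors
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def cohens_d(a, b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(sp2)


# Illustrative data only (not real participant data).
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0, 8.0, 10.0]

print(pooled_t(a, b))  # t is about -1.897 with df = 8
print(welch_t(a, b))   # same t (equal group sizes), df is about 5.88
print(cohens_d(a, b))  # about -1.2
```

With equal group sizes the two t statistics coincide and only the degrees of freedom change; with unequal sizes and unequal variances the statistics themselves diverge.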
Who it's for
The people analyst who has to defend a finding in a room: to a skeptical executive, a legal reviewer, or a peer who knows the difference between a t-test and a Welch correction. They have the data and the question; what they need is the assurance that the test they ran is the right test, that its assumptions held, and that the procedure traces back to a citable source rather than a habit. They get two concrete outcomes: a defensible answer for a specific comparison (which group differs, by how much, with which effect size and which assumption-respecting test), and a public catalog they can point a colleague to when a methods argument needs settling. It is also a foundation other surfaces stand on: the pay-fairness checks in the comp layer and the small-N enrichment elsewhere only get to claim rigor because this layer owns the math underneath them.
The rigor doctrine, end to end
The flagship demo runs the whole doctrine in one call. Given two de-identified samples, it will:
- Check the assumptions: are the data normal enough, and are the variances equal?
- Choose the right test automatically — severe non-normality routes to Mann-Whitney; unequal variance routes to Welch; otherwise a pooled t-test.
- Run the chosen test plus an effect size (Cohen's d).
- Show the remedy moment — the difference between the naïve, assumption-blind pooled answer and the applied, assumption-respecting one, with every assumption linked to its catalogued remedy and the canon that grounds it.
That last step is the point: most tools quietly run the default test and report a number. This one shows you what the default would have told you, why it would have been wrong, and what the correct procedure says instead.
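The four steps above can be sketched in plain Python as follows. This is a hedged illustration, not the service's implementation: the function names, the result layout, and both thresholds are assumptions. In particular, the skewness cutoff and the variance-ratio cutoff are simple stand-in heuristics for whatever formal normality and homoscedasticity checks the compute service actually runs.

```python
import math
from statistics import mean, variance


def skewness(x):
    """Sample skewness from the second and third central moments."""
    n, m = len(x), mean(x)
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    return m3 / m2 ** 1.5


def t_stat(a, b, pooled):
    """t statistic using a pooled (Student) or unpooled (Welch) standard error."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)
    if pooled:
        sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
        se = math.sqrt(sp2 * (1 / na + 1 / nb))
    else:
        se = math.sqrt(va / na + vb / nb)
    return (mean(a) - mean(b)) / se


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs where a > b, ties counted as one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(sp2)


def analyze(a, b, skew_limit=2.0, var_ratio_limit=4.0):
    """Check assumptions, route to a test, and report naive vs applied.

    skew_limit and var_ratio_limit are hypothetical stand-in thresholds.
    """
    va, vb = variance(a), variance(b)
    if min(va, vb) == 0:
        raise ValueError("each sample needs some variation")
    checks = {
        "normality": max(abs(skewness(a)), abs(skewness(b))) <= skew_limit,
        "equal_variance": max(va, vb) / min(va, vb) <= var_ratio_limit,
    }
    # The naive answer: the pooled t-test run without looking at assumptions.
    naive = {"test": "pooled_t", "statistic": t_stat(a, b, pooled=True)}
    if not checks["normality"]:
        applied = {"test": "mann_whitney", "statistic": mann_whitney_u(a, b)}
    elif not checks["equal_variance"]:
        applied = {"test": "welch_t", "statistic": t_stat(a, b, pooled=False)}
    else:
        applied = naive
    return {"checks": checks, "naive": naive, "applied": applied,
            "cohens_d": cohens_d(a, b)}


# Illustrative data only: similar centers, very different spreads.
result = analyze([4.0, 5.0, 6.0, 5.0, 4.0, 6.0], [2.0, 10.0, 4.0, 8.0])
print(result["naive"]["test"], result["applied"]["test"])  # pooled_t welch_t
```

In this example the pooled statistic is larger in magnitude than the Welch statistic, which is the kind of gap the remedy moment is meant to surface.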
The catalog (the public flag)
The same catalog that powers the wizards is published as a filterable, citable reference — study designs, tests, assumptions, remedies, and sampling methods, each with a plain-language summary, a when-to-use and when-not-to-use, and the source it rests on. It is to research methods what the visualization catalog is to charts: the discoverable, honest inventory of what the platform actually knows how to do.
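One way to picture a catalog entry and the filtering described above is the sketch below. The field names, slugs, and the filter function are hypothetical, chosen to mirror the fields this section lists (summary, when-to-use, when-not-to-use, source); the real catalog schema is not shown here, and the source strings are placeholders rather than real citations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    slug: str
    kind: str  # "design", "test", "assumption", "remedy", or "sampling"
    summary: str
    when_to_use: str
    when_not_to_use: str
    source: str  # placeholder; a real entry would carry its citation


CATALOG = [
    CatalogEntry(
        slug="welch-t-test",
        kind="test",
        summary="Compares two independent group means without assuming equal variances.",
        when_to_use="Two independent groups whose spreads differ.",
        when_not_to_use="Paired or repeated measurements of the same units.",
        source="(citation placeholder)",
    ),
    CatalogEntry(
        slug="mann-whitney-u",
        kind="test",
        summary="Rank-based comparison of two independent groups.",
        when_to_use="Two independent groups with severely non-normal data.",
        when_not_to_use="Paired or repeated measurements of the same units.",
        source="(citation placeholder)",
    ),
    CatalogEntry(
        slug="equal-variance",
        kind="assumption",
        summary="Both groups share the same population variance.",
        when_to_use="Checked before running a pooled t-test.",
        when_not_to_use="Not required by Welch's t-test.",
        source="(citation placeholder)",
    ),
]


def filter_catalog(entries, kind=None, text=None):
    """Return entries matching an optional kind and an optional text search."""
    hits = []
    for entry in entries:
        if kind is not None and entry.kind != kind:
            continue
        haystack = (entry.summary + " " + entry.when_to_use).lower()
        if text is not None and text.lower() not in haystack:
            continue
        hits.append(entry)
    return hits


print([e.slug for e in filter_catalog(CATALOG, kind="test")])
# ['welch-t-test', 'mann-whitney-u']
```

Because the wizards and the public reference read the same list, a filter like this would serve both the browsing page and the engine's lookup.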
Honesty rail
The selection side — sampling, balance checks, and allocation for the study-designer — is live and reproducible (seeded). The broader design-recommendation engine — pattern-matching a research question to a recommended design from Principia's coded corpus — is a documented follow-up, not yet shipped. ANOVA, regression, and paired tests are on the same roadmap. What ships today is honest about its boundaries; nothing in the catalog is claimed beyond what the compute service can run.
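As a rough illustration of what "reproducible (seeded)" means for allocation and balance checks, the sketch below splits de-identified unit tokens into two arms with a seeded random generator and then computes a standardized mean difference on one covariate. The function names, the tokens, and the tenure covariate are all hypothetical; the service's actual allocation scheme is not described in this section.

```python
import math
import random
from statistics import mean, variance


def allocate(unit_ids, seed):
    """Randomly split units into two arms; the same seed gives the same split."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def smd(x, y):
    """Standardized mean difference using the average of the two variances."""
    return (mean(x) - mean(y)) / math.sqrt((variance(x) + variance(y)) / 2)


# Illustrative, made-up data: de-identified tokens with one numeric covariate.
tenure = {f"u{i:02d}": float(1 + (i * 7) % 11) for i in range(20)}

treatment, control = allocate(tenure.keys(), seed=7)
rerun = allocate(tenure.keys(), seed=7)
print((treatment, control) == rerun)  # True: the seed reproduces the split

balance = smd([tenure[u] for u in treatment], [tenure[u] for u in control])
print(round(balance, 3))  # values near zero indicate well-balanced arms
```

A common rule of thumb treats an absolute standardized mean difference below about 0.1 as acceptable balance, though the threshold a given study should use is a design decision.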
Why it is shaped this way
- Stateless by design. Callers post de-identified numeric vectors; nothing row-level is persisted. Run the anonymizer before any call that carries participant data.
- The math is owned, the evidence is sourced. Computation lives in the toolbox so it never drifts; the citations and priors come from Principia so the rigor claims are grounded, not asserted.
- One catalog, two faces. The wizards and the public rigor catalog read the same source — the reference an analyst browses is the same one the engine consults.
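The "stateless by design" point can be illustrated with a small guard that accepts only numeric vectors and rejects anything that looks row-level. This is a sketch under assumptions: the payload keys group_a and group_b and the function name are hypothetical, not the service's actual request schema.

```python
import math

ALLOWED_KEYS = {"group_a", "group_b"}  # hypothetical payload keys


def validate_payload(payload):
    """Return two lists of floats, or raise ValueError for anything else.

    Extra keys (for example identifiers) are refused outright, so nothing
    beyond the numeric vectors can ride along with a request.
    """
    extra = set(payload) - ALLOWED_KEYS
    if extra:
        raise ValueError(f"unexpected keys: {sorted(extra)}")
    missing = ALLOWED_KEYS - set(payload)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    vectors = []
    for key in ("group_a", "group_b"):
        values = payload[key]
        if not isinstance(values, list) or len(values) < 2:
            raise ValueError(f"{key} must be a list of at least two numbers")
        for v in values:
            # bool is a subclass of int, so it is excluded explicitly
            if isinstance(v, bool) or not isinstance(v, (int, float)):
                raise ValueError(f"{key} contains a non-numeric value")
            if not math.isfinite(v):
                raise ValueError(f"{key} contains a non-finite value")
        vectors.append([float(v) for v in values])
    return vectors[0], vectors[1]


a, b = validate_payload({"group_a": [1, 2.5, 3], "group_b": [4.0, 5.0]})
print(a, b)  # [1.0, 2.5, 3.0] [4.0, 5.0]
```

A guard like this complements the anonymizer rather than replacing it: it can refuse obviously row-level fields, but it cannot tell whether a numeric vector was de-identified upstream.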
Related capabilities
- Statistical analysis engine — the lower-level compute primitives this layer composes.
- PA Instruments — the measurement building blocks that assume rigorous methods underneath.
- Compensation scenario modeling — a consumer of the pay-fairness checks this layer makes trustworthy.
