Preregistration scaffold — synthesis-analytic protocol

Nine-section template for synthesis-analytic preregistrations filed when construct-family surveys surface meta-analytic gaps. Concrete preregistrations land alongside the engagement survey if a meta-analytic gap appears. Inherits structure from Vela study-01.md.

By Mike West

Principia·Preregistrations & protocols·source: people-analyst/principia/docs/research/preregistrations/synthesis-analytic-protocol.md

  • Protocol identifier: synthesis-analytic-protocol
  • Repository: people-analyst/principia
  • Registration type: OSF-style internal preregistration (upload-ready), specialized for synthesis-analytic preregistrations filed against the Principia construct-family registry
  • Version: 1.0 — 2026-05-20
  • Engineering cross-references: docs/research/methodology.md §2 (source-grading rubric), §3 (effect-size column shapes), §6 (snapshot versioning); docs/specification/loop/curator-policy.md D4 (proposals-only — preregistration is curator-driven, not agent-driven); @people-analyst/measurement-core (EffectSize, CanonicalPrior, StudyQualityGrade)


How to use this scaffold

This document is a template. When a construct-family survey surfaces a meta-analytic gap — a tuple (from_construct, predicate, to_construct) where the existing meta-analyses do not answer the question Principia needs answered — copy this file to docs/research/preregistrations/<slug>.md, replace every [FILL: ...] placeholder with study-specific content, file the preregistration commit before extraction begins, and link the preregistration ID into the resulting CanonicalPrior row's preregistration_id field.

Per curator-policy.md D4, the act of filing a synthesis-analytic preregistration is curator-driven. Agents may propose synthesis tasks against thin priors (PRN-024 gap detector → PRN-025 ResearchTask) and may stage draft extractions in data/research/proposals/, but no agent path may file this preregistration unattended, and no agent path may write EffectSize or set citation extraction_status: verified on the basis of a preregistration alone. The preregistration is the lock; the curator turns the key.

The nine sections below are all required. If a section does not apply to a given synthesis, write the reason in place; do not delete the heading.


1. Question being answered

Tuple (Principia canonical form):

(from_construct_id: [FILL: principia:construct:<id>], predicate: [FILL: predicts | covaries_with | moderates | mediates], to_construct_id: [FILL: principia:construct:<id>])

Plain-English question: [FILL: one sentence, present tense, of the form "What is the population effect of X on Y across the post- peer-reviewed literature?" — name the constructs by their canonical Principia names, not by instrument-specific operationalizations.]

Why this question, and why now: [FILL: 2–4 sentences. Name the construct-family survey that surfaced the gap (e.g., "PRN-005 engagement survey, v1, snapshot 2026-MM-DD"). Name the specific reader the answer serves — practitioner deciding between two instruments, researcher needing a defensible prior for a Bayesian model, or registry-internal use as a CanonicalPrior seed for downstream consumers.]

Pre-registered claim shape: [FILL: name the parameter being estimated — typically a pooled correlation $\bar{r}$ or pooled standardized mean difference $\bar{d}$ with between-study heterogeneity $\tau^2$ — and the range of values that would constitute a substantive update to the registry's current prior, if any. The point is to write down, before looking at the data, what counts as a finding versus what counts as a null.]

The question is answered through synthesis of primary studies that already exist; Principia does not run new primary data collection under this scaffold. Primary-data studies file their own preregistrations (see Vela's docs/research/preregistrations/study-01.md for the Vela-side template).


2. Meta-analytic gap motivating the synthesis

Every synthesis-analytic preregistration must name what existing meta-analyses already cover and what they do not, in enough detail that a reviewer can verify the gap is real rather than a missed citation.

Existing meta-analyses surveyed (with grades per methodology.md §2):

| Citation (DOI) | Year | k studies | Tuple covered | Reported pooled effect | Quality grade | Why this synthesis does not close the gap |
|---|---|---|---|---|---|---|
| [FILL: Author et al., DOI] | [FILL: YYYY] | [FILL: k] | [FILL: from → to] | [FILL: r/d, CI] | [FILL: A/B/C/D] | [FILL: e.g., last search 20XX, no longitudinal designs, no cross-cultural sample, restricted to a single instrument family] |
| [FILL: …add rows] | | | | | | |

The gap, named: [FILL: 2–4 sentences. Be specific: "No meta-analysis published since YYYY covers tuple (X, predicts, Y) with the inclusion criteria below. The most recent ([Author, year, DOI], grade B) restricted to <population / methodology / instrument family>; the present synthesis broadens that scope to ."]

What this synthesis adds: [FILL: one or two specific items. Examples: post-2020 longitudinal evidence that the prior meta-analyses could not include; cross-cultural moderators not previously tested; effect-size rows linked to a specific instrument generation (e.g., UWES-9 only, separating UWES-9 from UWES-17 mixing); separation of supervisor-rated from self-rated outcomes.]

What this synthesis does not add: [FILL: explicit non-claims. Examples: this synthesis does not estimate causal effects; this synthesis does not adjudicate between competing theoretical models; this synthesis does not include unpublished gray literature beyond the explicit grey-lit channels named in §3.]

Where the existing meta-analyses are A-graded and recent, do not re-run the synthesis. File a ResearchTask of type find_methodology_gap instead, and let the curator decide whether a narrower question (e.g., a single under-tested moderator) is worth a fresh synthesis. The default presumption is that a well-graded, recent meta-analysis is sufficient — Principia's job is to surface and cite it, not to duplicate it.


3. Inclusion / exclusion criteria for primary studies

Inclusion criteria are written before searching. They are the lock against post-hoc filtering.

3.1 Search strategy

Databases (specify all): [FILL: e.g., PsycINFO, Web of Science core collection, Scopus, Google Scholar via SerpAPI per PRN-017, CrossRef via @people-analyst/literature, ProQuest Dissertations & Theses A&I for grey-lit channel.]

Search dates: [FILL: start date — end date. Synthesis-analytic preregistrations lock the end date; re-runs at a later snapshot require an amendment per §9.]

Search strings: [FILL: full Boolean strings per database, copy-paste reproducible. Include controlled-vocabulary terms (MeSH, APA Thesaurus) where the database supports them.]

Forward / backward citation chasing: [FILL: yes/no, with specific seed-set citations (DOIs).]

Hand-searched journals / sources: [FILL: list, or "none".]

3.2 Inclusion criteria (all must be satisfied)

  • Construct operationalization. The study must measure both constructs in the tuple using an instrument Principia recognizes as a member of the corresponding Instrument family (see registry rows under principia:instrument:*). [FILL: list the specific instruments that qualify, e.g., for engagement: UWES-9, UWES-17, JES, MEI, Gallup Q12, ISA Engagement Scale.]
  • Study design. [FILL: e.g., quantitative; cross-sectional, longitudinal, or experimental; must report a usable effect-size statistic per §5.]
  • Population. [FILL: e.g., working adults in employment; minimum sample age; exclusions for clinical populations if relevant.]
  • Language. [FILL: e.g., English only / English + Spanish / no language restriction with translation budget.]
  • Year range. [FILL: e.g., 2000–[search end date]; or whatever bounded window matches the gap in §2.]
  • Peer review status. [FILL: published in peer-reviewed venue, with specific handling for dissertations and conference proceedings — typically grade-C/D by methodology.md §2 unless the dissertation was later published.]
  • Statistical reporting sufficiency. The study must report enough to compute the effect size in §5 (point estimate plus one of: standard error, confidence interval, t/F statistic with df, or raw means + SDs + cell sizes).

3.3 Exclusion criteria

  • Studies that report only composite scores spanning both tuple constructs (confounded predictor/outcome operationalization).
  • Studies whose instrument falls outside the family-recognized list in §3.2.
  • Studies with n below [FILL: threshold — typical default n ≥ 30 per cell for correlation, n ≥ 50 per arm for between-group designs].
  • Studies that have been retracted, that are subject to an expression of concern, or for which a corrigendum changes the reported effect size beyond the precision required for inclusion.
  • [FILL: study-specific exclusions, e.g., student samples for adult-engagement work; samples drawn from organizations the author had a financial relationship with.]

3.4 Handling of independence

[FILL: how multiple effect sizes from the same sample are handled. Default: one effect size per independent sample per tuple; when a study reports multiple, take the most fully-adjusted estimate, or aggregate within-study via the methods named in §5.4. Decision is locked at preregistration; departures logged in §9.]


4. Coding protocol

Two trained coders extract every study independently. Coding decisions are reconciled through a documented process; the inter-rater reliability target is set in advance.

Coding form fields (the per-study extraction row):

The coding form mirrors the EffectSize row shape in methodology.md §3 plus a moderator block. Required fields:

  • study_id (Principia internal, assigned at coding time)
  • publication — full citation; DOI required if available
  • construct_a / construct_b (principia:construct:<id> canonical resolution)
  • instrument_a / instrument_b (principia:instrument:<id> if registry-known; verbatim name otherwise)
  • effect_type (r / d / β / OR / η² / f² / other)
  • effect_value, ci_lower, ci_upper, p_value, n
  • study_design, population, country_or_region, language
  • quality_grade (A / B / C / D per methodology.md §2, with one-sentence rationale)
  • extraction_notes (load-bearing covariates; analytic adjustments; selection issues)
  • Moderator fields (pre-specified): [FILL: list moderators that will be coded for heterogeneity analysis in §6. Common: publication year; design (cross-sectional vs longitudinal); country WEIRD/non-WEIRD; reporter (self / supervisor / peer); industry sector. Adding a moderator after coding begins requires a §9 amendment.]
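
The field list above can be sketched as a typed record with a few sanity checks run before a row is staged. This is an illustrative Python sketch only: the class name `ExtractionRow`, the `problems()` helper, and the field types are assumptions made here, not the EffectSize type from @people-analyst/measurement-core.

```python
from dataclasses import dataclass, field
from typing import Optional

EFFECT_TYPES = {"r", "d", "β", "OR", "η²", "f²", "other"}
GRADES = {"A", "B", "C", "D"}


@dataclass
class ExtractionRow:
    """Hypothetical per-study coding row; field names follow the list above."""
    study_id: str
    publication: str
    construct_a: str
    construct_b: str
    instrument_a: str
    instrument_b: str
    effect_type: str
    effect_value: float
    n: int
    quality_grade: str
    ci_lower: Optional[float] = None
    ci_upper: Optional[float] = None
    p_value: Optional[float] = None
    study_design: str = ""
    population: str = ""
    country_or_region: str = ""
    language: str = ""
    extraction_notes: str = ""
    moderators: dict = field(default_factory=dict)

    def problems(self):
        """Return a list of human-readable issues; empty means the row looks sane."""
        issues = []
        if self.effect_type not in EFFECT_TYPES:
            issues.append("unknown effect_type")
        if self.quality_grade not in GRADES:
            issues.append("quality_grade must be A, B, C, or D")
        if self.n <= 0:
            issues.append("n must be positive")
        for cid in (self.construct_a, self.construct_b):
            if not cid.startswith("principia:construct:"):
                issues.append("construct id is not in canonical form: " + cid)
        if self.ci_lower is not None and self.ci_upper is not None:
            if not self.ci_lower <= self.effect_value <= self.ci_upper:
                issues.append("effect_value lies outside its confidence interval")
        if self.p_value is not None and not 0.0 <= self.p_value <= 1.0:
            issues.append("p_value must lie in [0, 1]")
        return issues
```

A check like this catches transcription slips (for example, swapped CI bounds) before the two coders' rows are compared; it does not replace adjudication against the source paper.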

Coder training: [FILL: number of pilot studies coded jointly before independent coding begins. Default: 5 studies, with disagreements walked through case-by-case until coders converge.]

Inter-rater reliability target:

  • Categorical fields (design, population type, quality grade, instrument identification): Cohen's κ ≥ 0.80 on the first 20 independently-coded studies. Below 0.80 triggers a re-training cycle and re-coding of the disagreement set.
  • Numeric fields (effect-size values, n, CI bounds): exact agreement ≥ 95% within rounding tolerance ($\pm 0.01$ for correlations; $\pm 0.05$ for d). Disagreements adjudicated against the source paper.
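
As a rough illustration of the two reliability checks above, the following Python sketch computes Cohen's κ for a categorical field and the share of numeric values that agree within a tolerance. The function names are hypothetical, and the small epsilon added to the tolerance is an assumption made here to absorb floating-point noise at the boundary.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' categorical labels on the same studies."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of studies")
    n = len(coder_a)
    # Observed proportion of studies on which the coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def numeric_agreement(values_a, values_b, tolerance):
    """Share of paired numeric values that agree within +/- tolerance."""
    if len(values_a) != len(values_b) or not values_a:
        raise ValueError("both coders must supply the same non-empty set of values")
    hits = sum(abs(a - b) <= tolerance + 1e-12 for a, b in zip(values_a, values_b))
    return hits / len(values_a)


# Toy example: four studies, one disagreement on quality grade.
grades_a = ["A", "A", "B", "B"]
grades_b = ["A", "A", "B", "A"]
kappa = cohens_kappa(grades_a, grades_b)   # 0.5 for this toy data
needs_retraining = kappa < 0.80            # True: below the preregistered target
```

In practice the check would run once per categorical field on the first 20 independently coded studies, with the numeric check run at 0.01 for correlations and 0.05 for d.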

Disagreement reconciliation: Where coders disagree after independent extraction, both flag the disagreement; the curator (or a third coder where the curator is one of the original two) adjudicates against the source paper. Adjudication notes are preserved on the row as extraction_notes. No effect-size row is promoted to the registry until coder agreement is recorded. Per curator-policy.md D4, the agent path may stage extractions in data/research/proposals/; promotion is curator-only.


5. Effect-size extraction protocol

5.1 Target effect-size metric

[FILL: typically Pearson's $r$ for construct–construct relationships; standardized mean difference $d$ (Hedges' $g$ correction) for between-group designs; log-odds-ratio for binary outcomes. State the target metric and the transformation rules below.]

5.2 Conversion rules

When a study reports the effect in a non-target metric, convert per the following deterministic rules (lock these at preregistration; do not switch mid-analysis):

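  • $t$ / $F$ statistic → $r$: $r = \sqrt{t^2 / (t^2 + df)}$; for $F(1, df_2)$, $r = \sqrt{F / (F + df_2)}$. Both forms return the magnitude only; take the sign of $r$ from the reported direction of the effect.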
  • $d$ → $r$: $r = d / \sqrt{d^2 + 4}$ (assuming equal cell sizes; use the unequal-cell formula where group n's are known).
  • Odds ratio → $d$ → $r$: Cox's transformation $d = \ln(\text{OR}) / 1.65$, then convert to $r$.
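  • $\eta^2$ → $r$: $r = \sqrt{\eta^2}$ (valid only for a single-degree-of-freedom effect such as a two-group contrast; take the sign from the reported direction of the effect).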
  • Standardized regression $\beta$ → $r$: accept only when $\beta$ is the bivariate standardized estimate; do not back-convert multivariate $\beta$s without the full correlation matrix.
  • Fisher's $z$ transformation applied to $r$ before pooling: $z = 0.5 \ln((1+r)/(1-r))$; back-transform pooled $z$ to $r$ for reporting. This matches the PRN-021 prior-synthesis engine.
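
The conversion rules above can be sketched as a handful of small Python functions. This is a minimal sketch, not the PRN-021 implementation: the function names are hypothetical, and the unequal-cell correction $a = (n_1 + n_2)^2 / (n_1 n_2)$ is the standard generalization of the constant 4 in the equal-cell formula.

```python
import math


def r_from_t(t, df):
    """t statistic -> r; the sign of t carries the direction of the effect."""
    return math.copysign(math.sqrt(t * t / (t * t + df)), t)


def r_from_f(f, df2, sign=1):
    """F(1, df2) -> r; F is unsigned, so the direction must come from the paper."""
    return sign * math.sqrt(f / (f + df2))


def r_from_d(d, n1=None, n2=None):
    """d -> r; a = 4 for equal cells, (n1 + n2)^2 / (n1 * n2) when n's are known."""
    a = 4.0 if n1 is None or n2 is None else (n1 + n2) ** 2 / (n1 * n2)
    return d / math.sqrt(d * d + a)


def d_from_odds_ratio(odds_ratio):
    """Cox's transformation: d = ln(OR) / 1.65."""
    return math.log(odds_ratio) / 1.65


def fisher_z(r):
    """Fisher's z transform applied to r before pooling."""
    return 0.5 * math.log((1 + r) / (1 - r))


def r_from_fisher_z(z):
    """Back-transform a pooled z to r for reporting."""
    return math.tanh(z)


# Odds ratio -> d -> r chains the two steps named in the rules above.
r_from_or = r_from_d(d_from_odds_ratio(2.5))
```

Keeping each rule as its own function makes it straightforward to log which conversion produced a given row, which matters when a deviation entry later has to cite the rule that was applied.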

[FILL: any study-family-specific extensions, e.g., handling of multi-level effect sizes; partial vs zero-order correlations; semi-partial corrections.]

5.3 Handling missing precision

When a study reports a point estimate without a usable variance estimate (no CI, no SE, no t/F):

  • If $n$ is reported, impute the standard error from $n$ using the standard large-sample formula ($\text{SE}(z) = 1/\sqrt{n-3}$ for Fisher's-z-transformed correlations; this approximation does not depend on the value of $r$).
  • If $n$ is not reported either, exclude the study from pooled estimates but retain the row in the registry as extraction_status: noted_for_provenance, with a note explaining the exclusion. This serves §7 publication-bias diagnostics by preserving knowledge of the study's existence even when it cannot enter the pool.
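
For the first case above (a correlation reported with $n$ but no variance estimate), the imputation can be sketched as follows. The function name and the returned tuple layout are assumptions made for illustration; the 95% interval uses the usual normal critical value on the $z$ scale.

```python
import math


def impute_z_interval(r, n, z_crit=1.959964):
    """Return (z, se, r_lower, r_upper) for a correlation reported with n only."""
    if n <= 3:
        raise ValueError("n must exceed 3 for SE(z) = 1/sqrt(n - 3)")
    z = math.atanh(r)               # Fisher's z transform of r
    se = 1.0 / math.sqrt(n - 3)     # large-sample standard error of z
    # Build the interval on the z scale, then back-transform to r.
    return z, se, math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

Rows imputed this way are worth flagging in extraction_notes so that a sensitivity run excluding imputed-variance studies can find them.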

5.4 Within-study aggregation

When a single sample contributes multiple effect sizes for the tuple (e.g., engagement measured via UWES-9 and JES on the same employees, both correlated with the same performance outcome):

  • Default: aggregate via Gleser & Olkin (2009) variance-covariance weighting if the within-study inter-correlation is reported.
  • Fallback (no within-study correlation reported): use the mean of the within-study effect sizes with an inflated variance per [FILL: e.g., Borenstein et al. 2009 §24.4].
  • Sensitivity: re-run pooling using each within-study effect size independently, as a §8 sensitivity analysis.
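
One possible reading of the fallback rule is sketched below: average the within-study effect sizes and compute the variance of that mean under an assumed common correlation `rho` between the estimates. The choice of `rho` is an assumption that the preregistration must fix through the [FILL] above; `rho = 1.0` is the most conservative value (it treats the estimates as fully dependent), while `rho = 0.0` reproduces the dependency-naive variance.

```python
import math


def aggregate_within_study(effects, variances, rho=1.0):
    """Mean of correlated within-study effects and the variance of that mean.

    Var(mean) = (1/m^2) * (sum_i v_i + sum_{i != j} rho * sqrt(v_i * v_j)).
    rho is an assumed common correlation, not a value taken from the study.
    """
    m = len(effects)
    if m == 0 or m != len(variances):
        raise ValueError("effects and variances must be non-empty and equal length")
    mean_effect = sum(effects) / m
    total = sum(variances)
    for i in range(m):
        for j in range(m):
            if i != j:
                total += rho * math.sqrt(variances[i] * variances[j])
    return mean_effect, total / (m * m)
```

With two estimates of equal variance $v$, `rho = 1.0` returns a composite variance of $v$ (no precision gained from the second estimate) and `rho = 0.0` returns $v/2$.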

5.5 Provenance preservation

Every extracted effect-size row links back to:

  • the source Citation row (by DOI or registry citation_id)
  • the extraction PDF or full-text URL (stored in the registry's verification log per methodology.md §7)
  • the coder pair and the adjudication note (if any)

Per methodology.md §7, every extracted row is checked against the source paper before promotion. AI-assisted extraction is allowed; AI-only promotion is not.


6. Heterogeneity-analysis plan

6.1 Heterogeneity statistics

Report all three of the following for every pooled effect:

  • $Q$ statistic (Cochran's Q) with degrees of freedom and p-value.
  • $I^2$ (proportion of total variability attributable to heterogeneity rather than sampling error), with 95% CI.
  • $\tau^2$ (between-study variance), with 95% CI; this is the parameter the PRN-021 prior-synthesis engine consumes for downstream Bayesian use.

6.2 Estimator choice

Primary model: random-effects meta-analysis with DerSimonian–Laird $\tau^2$ estimation (matches the PRN-021 default).
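
A minimal pure-Python sketch of the DerSimonian–Laird computation follows, returning the pooled estimate together with $Q$, $I^2$, and $\tau^2$. It is an illustration of the textbook method-of-moments estimator, not the PRN-021 engine; the function name and return layout are hypothetical, inputs are assumed to be on the Fisher-$z$ scale with their sampling variances, and the $Q$ p-value and the confidence intervals for $I^2$ and $\tau^2$ are left to a statistics library.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    w = [1.0 / v for v in variances]                  # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)                     # truncated at zero
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]      # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"pooled": pooled, "se": se, "q": q, "df": df, "i2": i2, "tau2": tau2}
```

For three studies with effects 0.1, 0.3, 0.5 and a common variance of 0.01, the sketch gives $Q = 8$ on 2 degrees of freedom, $\tau^2 = 0.03$, and $I^2 = 0.75$.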

Sensitivity: re-fit with REML estimation and with Hartung–Knapp–Sidik–Jonkman confidence-interval adjustment; report whether the substantive conclusion changes. Departures logged in §9.

6.3 Pre-specified moderators

Moderators are decided at preregistration time; the moderator analyses below run regardless of the pooled-effect result. Post-hoc moderators are not added without a §9 amendment.

[FILL: list pre-specified moderators with a one-sentence theoretical justification each. Examples:

  • Study design (cross-sectional vs longitudinal) — longitudinal designs are presumed to estimate a smaller effect because they remove method-shared variance; if so, the pooled cross-sectional estimate is upward-biased.
  • Reporter source (self-report vs supervisor-rated outcome) — common-method variance inflates self–self correlations; the moderator analysis quantifies the inflation.
  • Publication year (pre/post a date) — tests whether the effect has drifted across the literature.
  • WEIRD vs non-WEIRD sample — cross-cultural generalizability check.]

6.4 Subgroup vs meta-regression

For each pre-specified moderator:

  • Categorical moderators with ≥ 2 levels: subgroup analysis with between-subgroup $Q_B$ test; report subgroup-specific pooled estimates.
  • Continuous moderators (e.g., publication year, mean sample age, % female): meta-regression with $\tau^2$ from the random-effects model.

Subgroups with $k < 4$ studies are reported descriptively, not pooled — this matches the PRN-021 engine's $k=1$ fixed-effects-fallback boundary and prevents single-study subgroups from masquerading as pooled estimates.
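
The subgroup rule can be sketched as a between-subgroup $Q_B$ computation that sets aside any subgroup below the minimum $k$. This sketch assumes the subgroup-specific pooled estimates and standard errors have already been computed; whether $\tau^2$ is shared across subgroups or estimated separately is not specified here and is left to the preregistration. The function name and input layout are hypothetical.

```python
def between_subgroup_q(subgroups, min_k=4):
    """Q_B across subgroup pooled estimates.

    subgroups maps label -> (pooled_estimate, standard_error, k).
    Returns (q_b, df, descriptive_only); q_b is None if fewer than two
    subgroups meet min_k.
    """
    pooled = {g: v for g, v in subgroups.items() if v[2] >= min_k}
    descriptive_only = sorted(g for g in subgroups if g not in pooled)
    if len(pooled) < 2:
        return None, 0, descriptive_only
    weights = {g: 1.0 / (se * se) for g, (_, se, _) in pooled.items()}
    total_w = sum(weights.values())
    grand = sum(weights[g] * pooled[g][0] for g in pooled) / total_w
    # Weighted squared deviations of subgroup means from the grand mean.
    q_b = sum(weights[g] * (pooled[g][0] - grand) ** 2 for g in pooled)
    return q_b, len(pooled) - 1, descriptive_only
```

$Q_B$ is referred to a chi-square distribution with the returned degrees of freedom; subgroups listed in `descriptive_only` are reported individually and excluded from the test.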


7. Publication-bias diagnostics

Publication bias is the largest single threat to validity in synthesis-analytic work, and the diagnostics chosen at preregistration commit to which threats the synthesis takes seriously.

7.1 Funnel-plot inspection

Visual inspection of the funnel plot (effect size vs precision, typically standard error or $1/\sqrt{n}$) is reported in the manuscript. Visual asymmetry is described, not interpreted as evidence on its own. Quantitative tests follow.

7.2 Egger's regression test

Egger's regression of the standardized effect on its precision. Report the intercept, the 95% CI, and the p-value. Significant intercept ($p < 0.10$, the conventional Egger threshold) flags asymmetry consistent with — but not diagnostic of — small-study effects or publication bias.
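
A minimal sketch of the regression described above: each effect is divided by its standard error and regressed, by ordinary least squares, on precision ($1/\text{SE}$). The function name and return layout are hypothetical. The sketch returns the intercept, its standard error, and the $t$ statistic; the p-value and confidence interval come from a $t$ distribution with $k - 2$ degrees of freedom and are left to a statistics library.

```python
import math


def egger_test(effects, standard_errors):
    """Egger's regression: (effect / SE) on (1 / SE); the intercept indexes asymmetry."""
    k = len(effects)
    if k < 3 or k != len(standard_errors):
        raise ValueError("need at least three studies with matching standard errors")
    x = [1.0 / se for se in standard_errors]                  # precision
    y = [e / se for e, se in zip(effects, standard_errors)]   # standardized effect
    x_bar, y_bar = sum(x) / k, sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must vary across studies")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (k - 2)                                        # residual variance
    se_intercept = math.sqrt(s2 * (1.0 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else math.nan
    return {"intercept": intercept, "slope": slope,
            "se_intercept": se_intercept, "t": t_stat, "df": k - 2}
```

When every study reports the same effect regardless of its precision, the intercept is zero; an effect that grows with the standard error pushes the intercept away from zero.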

7.3 Trim-and-fill

Duval & Tweedie's trim-and-fill procedure. Report the number of imputed studies and the adjusted pooled estimate. Caveat: trim-and-fill assumes that funnel asymmetry arises only from suppression of the most extreme studies on one side of an otherwise symmetric funnel; it is an exploratory diagnostic, not a corrective.

7.4 PET-PEESE

The Stanley & Doucouliagos PET-PEESE procedure. Report both the PET intercept (test for evidence of any effect after correcting for small-study bias) and the PEESE-adjusted pooled estimate. PET-PEESE is most informative when $k \geq 20$; below that, report it descriptively only.
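
The two regressions can be sketched with a small weighted-least-squares helper: PET regresses the effect on its standard error, PEESE regresses it on the squared standard error, both with weights $1/\text{SE}^2$, and in each case the intercept is the bias-adjusted estimate. The function names are hypothetical, and significance testing of the PET intercept is left to a statistics library.

```python
def wls_line(y, x, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("the predictor must vary across studies")
    slope = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y)) / sxx
    return y_bar - slope * x_bar, slope


def pet_peese(effects, standard_errors):
    """Return PET and PEESE intercepts and slopes (inverse-variance weights)."""
    w = [1.0 / (se * se) for se in standard_errors]
    pet_intercept, pet_slope = wls_line(effects, standard_errors, w)
    squared = [se * se for se in standard_errors]
    peese_intercept, peese_slope = wls_line(effects, squared, w)
    return {"pet_intercept": pet_intercept, "pet_slope": pet_slope,
            "peese_intercept": peese_intercept, "peese_slope": peese_slope}
```

Both intercepts estimate the effect a hypothetical study with zero standard error would report, which is why the two are reported side by side.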

7.5 Selection models

[FILL: optional — Vevea & Hedges (1995) weight-function selection model or a $p$-curve / $p$-uniform* analysis if the literature is suspected to be heavily affected by selective reporting. Default: include if $k \geq 20$, otherwise skip.]

7.6 Grey-lit asymmetry check

Compare the pooled effect from peer-reviewed-only sources to the pooled effect including grey-lit channels (dissertations, conference papers, technical reports). A meaningful gap is suggestive — not diagnostic — of selective publication. Report regardless of direction.


8. Sensitivity-analysis plan

Sensitivity analyses test whether the substantive conclusion depends on a specific analytic choice. Each sensitivity analysis is pre-registered; sensitivity analyses added after seeing the data are logged in §9.

8.1 Leave-one-out

Re-pool with each study removed in turn. Report the range of pooled estimates and flag any study whose removal moves the pooled estimate by more than [FILL: threshold, e.g., 20% of the pooled value or 0.05 in $r$ units].
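
The leave-one-out procedure can be sketched as below. The sketch bundles a compact DerSimonian–Laird pooling function so that it is self-contained; the names are hypothetical, the flagging threshold is treated as an absolute shift on the pooling scale, and a relative threshold (such as 20% of the pooled value) would need a different comparison.

```python
def pool_dl(effects, variances):
    """Random-effects pooled estimate with DerSimonian-Laird tau^2."""
    k = len(effects)
    if k == 1:
        return effects[0]
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_re = [1.0 / (v + tau2) for v in variances]
    return sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)


def leave_one_out(study_ids, effects, variances, threshold):
    """Re-pool with each study removed; flag shifts larger than threshold."""
    full = pool_dl(effects, variances)
    results = []
    for i, sid in enumerate(study_ids):
        rest_e = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        estimate = pool_dl(rest_e, rest_v)
        results.append((sid, estimate, abs(estimate - full) > threshold))
    return full, results
```

The range to report is the minimum and maximum of the second element across `results`; the flagged studies are those whose third element is true.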

8.2 Quality-weighted vs unweighted pooling

The PRN-021 engine quality-weights studies by methodology.md §2 grade (A=1.0, B=0.7, C=0.4, D=0.1). Re-run pooling unweighted and report both estimates. Substantial divergence indicates the quality-grade rubric is doing load-bearing work, which is informative even when the rubric is correct.
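
The comparison can be sketched as follows. How PRN-021 combines the grade multiplier with the inverse-variance weight is not stated in this scaffold, so the sketch assumes the simplest scheme, multiplying each study's random-effects weight by its grade multiplier; treat that as an illustrative assumption. The function name is hypothetical, and `tau2` is supplied by the caller.

```python
GRADE_WEIGHTS = {"A": 1.0, "B": 0.7, "C": 0.4, "D": 0.1}


def pooled_mean(effects, variances, grades=None, tau2=0.0):
    """Weighted mean; with grades, each weight is scaled by its grade multiplier.

    Assumed scheme: weight_i = grade_multiplier_i / (variance_i + tau2).
    """
    if grades is None:
        multipliers = [1.0] * len(effects)
    else:
        multipliers = [GRADE_WEIGHTS[g] for g in grades]
    weights = [m / (v + tau2) for m, v in zip(multipliers, variances)]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


effects, variances, grades = [0.2, 0.6], [0.01, 0.01], ["A", "D"]
unweighted = pooled_mean(effects, variances)
quality_weighted = pooled_mean(effects, variances, grades)
divergence = abs(quality_weighted - unweighted)
```

In this toy case the D-graded study reports the larger effect, so quality weighting pulls the pooled estimate toward the A-graded study; the size of `divergence` is the quantity the sensitivity analysis reports.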

8.3 Random-effects estimator alternative

Re-fit with REML and with HKSJ-adjusted intervals (see §6.2). Report whether the conclusion is robust.

8.4 Within-study aggregation alternative

Re-run with each within-study effect size treated independently rather than aggregated per §5.4. Report whether dependency-naive pooling changes the substantive conclusion.

8.5 Instrument-restricted re-pool

Re-pool restricting to a single instrument family (e.g., UWES-only for engagement). Wide divergence between instrument-restricted and full-pool estimates flags an instrument-generation confound and triggers a deeper investigation in the construct-family survey, not a re-estimation here.

8.6 Pre/post-cutoff re-pool

Re-pool separately for studies published before and since [FILL: cutoff, often the search-end date of the most recent A/B-graded meta-analysis cited in §2]. The post-cutoff estimate is the synthesis's primary contribution against existing meta-analyses; the pre-cutoff estimate is the consistency check.

8.7 [FILL: study-specific sensitivities]

[FILL: any sensitivities specific to this synthesis. Examples: re-pool excluding studies with author overlap; re-pool restricting to longitudinal designs; re-pool with imputed missing-variance studies excluded.]


9. Deviations log

All deviations from this preregistration after filing are recorded here, dated and signed by the curator, before the affected analysis is re-run. The deviations-log section is preserved through the lifetime of the synthesis, even after publication: a synthesis with zero deviations is suspect; a synthesis whose deviations are documented is credible.

Deviation entry format:

### [Deviation N — YYYY-MM-DD]

**Section affected:** [e.g., §3.2 inclusion criteria; §6.3 moderators; §8.1 leave-one-out threshold]

**Original specification:** [verbatim from preregistration]

**Revised specification:** [verbatim of the change]

**Reason:** [why the change was necessary; data-driven reasons must say so explicitly]

**Effect on conclusions:** [whether the substantive conclusion changes; if yes, by how much]

**Signed:** [curator name; ISO date]

Severity classification (curator-applied):

  • Minor — typographical, scoping clarifications, search-string typos discovered before coding begins. Documented but does not impair the lock.
  • Moderate — inclusion-criterion adjustment, moderator addition, sensitivity addition. Documented; the affected analyses are flagged in the manuscript as "added post-hoc per preregistration deviation log §9 entry [N]".
  • Major — change to the primary effect-size metric, change to the random-effects vs fixed-effects choice, change to the heterogeneity estimator. Triggers a re-filing of the preregistration as a new version (v2), with the original v1 preserved unchanged. The published synthesis cites both versions.

Default content (delete when the first deviation lands):

No deviations as of [FILL: preregistration filing date]. This section is updated in place as deviations occur. The preregistration filing commit is the lock; this section is the running log.


Cross-references

  • docs/research/methodology.md — source-grading rubric (§2); effect-size column shapes (§3); construct-family ordering (§4); schema discipline against @people-analyst/measurement-core (§5); snapshot versioning (§6); novelty verification (§7).
  • docs/specification/loop/curator-policy.md D4 (proposals-only — preregistration is curator-driven, not agent-driven); D1 (digest storage — synthesis-analytic gaps surface through curator digests); D3 (rejected status for studies excluded after coding).
  • docs/specification/loop/research-ingestion-vision.md — Path 1 (registry-initiated research) is the channel through which synthesis-analytic gaps reach the curator's queue.
  • @people-analyst/measurement-core — canonical EffectSize, CanonicalPrior, StudyQualityGrade types consumed by this scaffold.
  • PRN-021 — Bayesian prior-synthesis engine that consumes the effect-size rows this preregistration produces.
  • PRN-024 / PRN-025 — gap detector and ResearchTask workflow that surface synthesis-analytic candidates to the curator before this preregistration is filed.

OSF upload checklist (when the synthesis is filed publicly)

  • Create OSF project "Principia synthesis-analytic — [tuple slug]"
  • Upload this filled markdown as a component
  • Link the frozen Principia repo commit SHA after the preregistration commit lands
  • Register DOI before any effect-size extraction begins (the lock binds at OSF registration; pre-OSF commits are drafts)
  • Link the OSF DOI back into the Principia registry on the CanonicalPrior.preregistration_id field once the synthesis is run

The first concrete preregistration filed under this scaffold lands alongside PRN-005 (engagement construct-family survey) if the engagement-survey work surfaces a meta-analytic gap the existing literature does not close. The scaffold is canonical; the per-synthesis filings inherit from it without restating the rationale.