Christianity, sex, and shame — academic peer review

A senior referee's review of the christianity-sex-shame literature review and supporting documents, written for a psychology-of-religion / history-of-Christianity outlet.

By Mike West

Source: people-analyst/vela/docs/research/reviews/christianity-sex-shame-academic-peer-review.md

Peer-Review Memo — Christianity, Sexuality, and Shame

Manuscript: Christianity, Sexuality, and Shame: From Patristic Theology to Empirical Psychology (docs/research/papers/christianity-sex-shame-literature-review.md) — with supporting documents: the literature map (docs/research/christianity-sex-shame-literature-map.md), the three corpus syntheses dated 2026-04-23, and the theological-coherence intervention protocol v0.1 (docs/research/protocols/theological-coherence-intervention-v0.1.md).

Reviewer: Senior referee, psychology of religion / history of Christianity (cross-disciplinary).

Outlets considered (assumed): Psychology of Religion and Spirituality (best fit, given the manuscript's empirical center of mass); Journal of Sex Research methods track (acceptable; would need empirical-side reweighting); Archives of Sexual Behavior (acceptable but a longer reach); or, for a substantially reframed historical version, Church History or Journal of Religious History. The manuscript currently sits between two literatures and would need to commit to one for placement.

Date of review: 2026-05-22.

Recommendation in advance: Reject with strong encouragement to resubmit, after the manuscript chooses a venue and is restructured accordingly. The synthesis is genuinely useful as an internal program document. As a publishable contribution it is undercooked in the specific ways enumerated in §3.


1. Summary of claims

The literature review advances five claims, which I take to be the publishable backbone if any of this is published:

  1. The historical-theological narrative the corpus syntheses construct — Mediterranean ascetic substrate; Pauline pragmatism; Augustinian pivot; medieval canonical elaboration; East/West divergence; modern evangelical purity culture as the latest expression — is supported by the strongest available academic scholarship (Brown 1988, 2000; Harper 2013; Brundage 1987; MacCulloch 2024; O'Donnell 2005; Burrus 2008; Perisanidi 2017; Stan & Turcescu 2010).
  2. The empirical mechanism connecting Christian formation to sexual outcomes is reasonably well-characterised through two pathways: a sex-guilt pathway (Mosher; Woo, Morshedian, & Brotto 2012) producing desire inhibition, and a moral-incongruence pathway (Grubbs & Perry 2019; Grubbs et al. 2022) producing distress independent of behaviour frequency.
  3. The same religious tradition can also operate through a sanctification pathway (Murray-Swank, Pargament, & Mahoney 2005; Hernandez et al. 2011; Leonhardt, Busby, & Willoughby 2020) that predicts better rather than worse sexual outcomes — making religion a double-edged instrument whose direction of effect depends on pastoral formation.
  4. Purity culture is now a measurable research object (Klein 2018 as qualitative foundation; Ortiz et al. 2023 PCBS as psychometric instrument; Sawatsky et al. 2025; Muskrat et al. 2025; Coates et al. 2026) with documented harms across marital, sexual, trauma-interaction, and minority-identity outcomes.
  5. Intervention evidence — whether theological reframing can reduce shame outcomes — is almost entirely absent in the published literature. The closest proxy is the identity-integration work (Anderson & Koc 2020; Etengoff et al. 2024 SMRII). The manuscript identifies five specific research questions (K.1–K.5) that bridge the gap and proposes one of them as a registered RCT (the theological-coherence intervention protocol).

These are useful claims. (1)–(3) are mostly correct as summary statements of the field. (4) is the most empirically interesting and current. (5) is the most ambitious. None of them, in the present form, would survive the version of me that has read every primary source the manuscript cites. The reasons follow.


2. Strengths

  • The cross-validation method is unusual and worth keeping. Producing two independent parallel literature maps from different LLM substrates (browser Claude and ChatGPT Deep Research), then merging conservatively where they disagree, is a defensible bibliographic-discovery procedure for a domain this large. The reconciliation notes at the end of the review are honest about where the two drafts diverged. This is more methodological discipline than most narrative reviews show. If the manuscript is reframed as a methods piece, this is a publishable contribution on its own — see §5.
  • Coordinate-by-coordinate evidence-strength tagging in the literature map is a real contribution. The "Meta-analyzable? yes / partial / no" column and the "Evidence strength" taxonomy (historical narrative / small-N empirical / large-N empirical / meta-analytic / systematic review / psychometric validation / clinical case-series) are the kind of metadata that field reviews systematically fail to surface. A reader can identify in three minutes which findings could feed a quantitative synthesis and which cannot. Most narrative reviews in the psychology of religion do not.
  • The sanctification / sex-guilt double-edged framing is correctly load-bearing. Treating Leonhardt, Busby, & Willoughby (2020) and Hernandez et al. (2011) as central rather than peripheral is the right scholarly move. Most popular treatments of "religion produces sexual shame" omit the sanctification literature entirely; the manuscript does not. Section 3.3 of the literature review is the strongest passage in the document.
  • The K-section research-question framing is operationally serious. Five specific, falsifiable bridge questions, each tied to identified instruments and identified populations, is good program-building. K.3 (East/West empirical comparison) and K.5 (theological-reframing intervention) are the two that would produce genuinely new knowledge if executed. K.5 is operationalised in the v0.1 protocol; the protocol exists, has hypotheses, has a design, and names funding numbers. That is more than most research programs at this stage.
  • The reconciliation note is honest. Page-bottom acknowledgement that several major sources appeared only in one of the two parallel drafts (Woo 2012; Leonhardt 2020; PCBS 2023; Sawatsky 2025; Muskrat 2025; Coates 2026; Rigo & Saroglou 2019; Kaplan 2025; Gordon 2018; Perisanidi 2017) is the kind of disclosure most authors would hide. Keep it; expand it (see §3.2).
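The conservative merge described in the first bullet could be made auditable by carrying provenance through the union. A minimal sketch in Python, assuming each draft reduces to a list of normalised citation keys; the key format, the draft labels "A" and "B", and which draft surfaced which source are all hypothetical and not taken from the manuscript:

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    key: str                                   # hypothetical normalised key, e.g. "woo2012"
    drafts: set = field(default_factory=set)   # labels of the drafts that surfaced it


def merge_maps(draft_a, draft_b):
    """Union two citation lists, recording which draft(s) surfaced each key."""
    merged = {}
    for label, keys in (("A", draft_a), ("B", draft_b)):
        for key in keys:
            merged.setdefault(key, Citation(key)).drafts.add(label)
    return merged


def single_source(merged):
    """Keys seen in only one draft: first candidates for hand-checking."""
    return sorted(c.key for c in merged.values() if len(c.drafts) == 1)


# Illustrative inputs only; the assignment of keys to drafts is invented.
draft_a = ["brown1988", "harper2013", "woo2012", "grubbs2019"]
draft_b = ["brown1988", "harper2013", "grubbs2019", "perisanidi2017"]

merged = merge_maps(draft_a, draft_b)
print(single_source(merged))
```

Single-draft keys are the natural first candidates for a hand-validation pass, since no second search corroborates them.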

3. Major concerns

3.1 The manuscript is a literature review of an LLM-assisted synthesis whose primary-source coverage is partial, and this is not adequately acknowledged

The corpus syntheses (the three 2026-04-23 documents) were produced by retrieval-augmented generation over a ~1,692-passage library. The 2026-05-15 update note at the top of the case study (docs/research/2026-04-23-christianity-sex-hangup.md) candidly reports that six of the most important primary sources for this topic — Pagels, Boswell, Brooten, Harper, Jordan, Dale Martin — were either absent from the original retrieval set or, after ingest, still failed to surface in top-15 retrieval results because their bulk chunks were dominated by bibliography and footnote pollution. Three of those six (Boswell, Brooten, Dale Martin) remain functionally absent from retrieval even after the rerun. The literature review then synthesises against this partial corpus and against secondary academic sources discovered through two parallel LLM-assisted searches.

The manuscript acknowledges in passing (§1, §9 footnote) that the corpus is incomplete. It does not acknowledge that the historical-theological backbone of the synthesis (the part it presents as "supported by the strongest available academic scholarship") was constructed from a corpus that systematically under-represents Boswell's Christianity, Social Tolerance, and Homosexuality, Brooten on early-Christian women's sexuality, and Dale Martin on Paul. These are not optional sources for the claims being made. They are foundational. The fact that the synthesis arrived at conclusions broadly consistent with Brown, Harper, and MacCulloch despite the absence of Boswell, Brooten, and Martin is reassuring but not exonerating; it tells us the synthesis is reproducing the consensus view, which is what one would expect from RAG over consensus secondary sources. It does not tell us whether the consensus is correct.

The required disclosure for any submission is approximately: "Section 2's historical-theological synthesis was produced through retrieval-augmented synthesis against an LLM-readable corpus of N=1,692 passages. The corpus does not include — or includes but does not surface in retrieval for — Boswell (1980), Brooten (1996), and Martin (1995, 2006). The synthesis may therefore reflect the consensus position of the corpus's dominant secondary sources (MacCulloch, Brown, O'Donnell) rather than the field's full disciplinary breadth on questions of same-sex sexuality in early Christianity, the gendered reception of Pauline texts, and the contested interpretation of Romans 1."

Without this disclosure, the review presents an LLM-mediated consensus synthesis as a survey of the field. Those are not the same thing.

3.2 Multiple-testing and citation-coverage asymmetry across the two parallel literature maps is unmeasured

The reconciliation note acknowledges that several major citations appeared in only one of the two parallel drafts. The manuscript adopts the union: every citation that surfaced in either draft is included in the merged map. The implicit assumption is that more coverage is better. This is unsupervised citation accretion.

Two problems follow:

  1. No false-discovery rate for citation suggestions. The literature map contains ~80 citations across sections A–F + K. We do not know, and the manuscript does not investigate, how many citations either draft fabricated, mis-attributed, or assigned the wrong DOI before reconciliation. The manuscript states that DOIs were "verified against publisher pages or CrossRef" and that "high-risk 2023–2026 citations [were] spot-checked via WebFetch/WebSearch prior to commit." This is necessary but not sufficient. The relevant question is not "did the DOI resolve?" but "does the cited paper actually say what the row claims it says?" The two are routinely different in LLM-suggested bibliographies. A 5–15% claim-level error rate is a reasonable working estimate: citation-level audits of ChatGPT-generated bibliographies (see, e.g., Walters & Wilder 2023) report fabrication rates at least that high, and the manuscript provides no evidence its rate is lower.

  2. Sampling bias toward Anglophone, post-2015, indexable literature. Both parallel LLM drafts will systematically over-represent papers with English abstracts, with stable DOIs, with web-indexable PDFs. The literature map reflects this: §E.7 ("Orthodox Christianity — empirical gap") notes that "Greek and Russian Orthodox populations have not been studied at scale in English-language empirical sexual psychology literature." That sentence may be true; it may also be that Russian and Greek-language work exists and was systematically invisible to the search procedure. The manuscript has no way to distinguish these. Neither do I, which is the point — but the manuscript should.

Required: a hand-validation pass on a random subsample (n ≥ 20) of citations in §§A–F, where the reviewer reads the actual paper and reports per-row whether (a) the citation is correct, (b) the row claim accurately summarises the paper's contribution, (c) the evidence-strength tag is appropriate. Report the per-cell error rate. Without this, the literature map is a hypothesis, not a survey.
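One way to run the pass, sketched under stated assumptions: rows carry integer ids 1 to 80 (matching the approximate citation count above), the sample is a seeded simple random draw, and each of the checks (a) to (c) is scored pass/fail per row. The audit result used below (2 failures in 20) is invented purely to show the arithmetic.

```python
import math
import random


def draw_sample(row_ids, n=20, seed=0):
    """Draw a reproducible simple random sample of literature-map rows."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(row_ids), n))


def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for an error proportion (about 95% for z = 1.96)."""
    p = errors / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


rows = range(1, 81)                      # assumed row ids for ~80 citations
sample = draw_sample(rows, n=20, seed=2026)

# Hypothetical audit outcome: 2 of the 20 sampled rows fail check (b).
low, high = wilson_interval(errors=2, n=20)
print(f"sampled rows: {sample}")
print(f"check (b) error rate: {2 / 20:.0%} (interval {low:.1%} to {high:.1%})")
```

With 2 failures in 20 the interval runs from roughly 3% to 30%, which is why n = 20 should be read as a floor: a sample that small bounds the error rate only loosely.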

3.3 The identification claim around moral incongruence is partially overstated

Sections 3.2 and 9 of the literature review treat moral incongruence as the field's best-supported contemporary mechanism, citing Grubbs and Perry (2019) and Grubbs et al. (2022). The latter is described as "preregistered" and as providing "the highest level of confidence the field currently offers." This is approximately right but contains two slips a careful reviewer will flag.

  1. The registered report demonstrates that moral disapproval predicts perceived addiction-like appraisal of pornography use; it does not demonstrate that moral incongruence causes sexual shame in general. The dependent variable is a self-report of perceived problematic use, and the population is largely college-aged, US, English-speaking, with overrepresentation of LDS and evangelical Protestants in the Grubbs lab's recruitment streams. The leap from "moral disapproval predicts self-reports of compulsive pornography appraisal" to "the doctrine is doing more causal work than the sex" (public introduction §3) is a step the registered report does not license. It is a defensible interpretation; it is not what the registered report tested.
  2. The Jennings et al. (2021) systematic review's finding — "small-to-moderate and inconsistent associations across studies" — is more equivocal than the manuscript's summary treatment suggests. Section 3.2 reads as if moral incongruence is settled; Jennings reads as if it is one of several mechanisms with mixed support. The literature review should state Jennings's actual conclusion verbatim and reconcile it with the Grubbs-program-of-research interpretation it currently leads with.

Required: rewrite §3.2 and §9 second-paragraph to distinguish (a) the registered report's actual finding from (b) the broader interpretive claim about religious-sexual shame mechanisms. Use Jennings's review as the constraint rather than the footnote.

3.4 The historical-theological / empirical-psychology bridge is asserted, not constructed

The manuscript's central methodological move is to position the historical narrative against modern empirical work. The two literatures use different evidentiary standards: history relies on primary texts, contextual reading, and disciplinary peer judgement; empirical psychology relies on standardised measurement, replication, and statistical inference. These are not commensurable. The manuscript treats them as commensurable.

Specifically: when the literature review writes that the Pelagian controversy "drove Augustine to 'extremes of statement'… developing his mature doctrines of concupiscence and original sin in the late 390s and early 400s" (§2.2), and then writes that "the empirical mechanism is now reasonably well characterised" through the Woo–Grubbs pathways (§9), the implicit claim is that the doctrine Augustine developed in 400 CE is the doctrine whose operation is measured by the Mosher Sex Guilt subscale in 2012. This is a heroic conjecture. Between Augustine and the Mosher scale lies the entire history of medieval canon law (Brundage), the Counter-Reformation, Protestant differentiation, the 19th-century purity movements, the 20th-century evangelical socialisation regime, and the institutional infrastructure of modern American religious life. The manuscript names these in §2 but does not show that the variables Mosher (1968) measures are continuous with the doctrinal categories Augustine introduced.

The empirical work the manuscript cites does not, in fact, measure Augustinian inheritance. It measures contemporary religious-formation effects in mostly post-1990 American samples. The historical and empirical literatures are linked in the manuscript, not in the citation chain. This is what the K.1 research question (operationalise the historical lineages as distinct psychological schemas) acknowledges. Until K.1 is executed, the bridge is hypothesised, not built, and the literature review should say so. Currently §9's three conclusions overstate the integration.

3.5 The theological-coherence intervention protocol is not operationalised at clinical-trial standard

I read the protocol v0.1 carefully because it is the manuscript's most actionable downstream artifact. Multiple specifications fall below what a registered RCT in this space would need.

  1. The primary outcome (Mosher Sex Guilt at week 8) is a trait-like guilt measure used as if it were a change-sensitive intervention target. Mosher's instrument was developed for cross-sectional differentiation; its test-retest reliability over an 8-week interval in clinically-shifting populations is not well-established. The literature on shame interventions (Tangney & Dearing 2002; Gilbert 2010 on compassion-focused therapy) suggests that 8 weeks is the lower end of detectable change. The protocol should report the expected Cohen's d under a no-intervention assumption and justify the choice of Mosher vs. a more change-sensitive instrument (Brief State Shame Scale; Shame Aversion Scale).
  2. The active comparison ("CBT-based sex therapy") is not specified at protocol level. What CBT protocol? Brotto's mindfulness-based sex therapy is named in passing; mindfulness-based interventions are not CBT. Manualisation of the comparison condition is required for the comparison to mean anything. If both arms produce d = 0.6, you cannot conclude that theological framing is the active ingredient; you can only conclude that structured therapeutic attention produces an effect. The protocol's success criterion ("baseline → T1 Cohen's d ≥ 0.40 on Sex Guilt in intervention arm") does not address this — the intervention arm could meet that threshold while the comparison arm exceeds it. The pre-registered primary analysis must specify intervention minus comparison, not just intervention pre-post.
  3. The three tradition-stratified arms (A evangelical, B Catholic/Orthodox, C queer-affirming) are not statistically well-formed as a single trial. This is functionally three trials, each underpowered. With n = 30 per arm in the pilot and n = 80–120 per arm in the full study, the per-arm comparison-to-active-control has acceptable power for d ≥ 0.5 but the between-arm contrast (the H5 question — does intervention magnitude differ by tradition?) requires substantially larger samples, probably n ≥ 200 per arm. The protocol's H5 is descriptive ("open, descriptive"), which is appropriate, but the manuscript should be explicit that any tradition-by-intervention interaction estimates are exploratory and should not appear in the abstract.
  4. The randomisation is "stratified randomisation within each arm" — meaning each tradition arm randomises its own participants between intervention and CBT comparison. There is no cross-tradition randomisation. This is the right design for a tradition-stratified study but it means the trial cannot test whether the intervention works "across Christianity" — only whether it works within each tradition separately. The literature review's framing (§9, last paragraph: "the design and evaluation of theological reinterpretation as a clinical intervention remains the most important undone work") is broader than what the protocol can deliver. Tighten the language.
  5. The IRB pathway and pre-registration are listed as "in progress" / "pending." A protocol that has not cleared IRB and has not been registered on OSF is a draft, not a registered RCT. The literature review's citation of "the theological-coherence intervention protocol" as evidence that the bridge work is in progress would be more accurately described as designed but not yet initiated.
  6. Theological non-coercion ("the intervention is not covert proselytization in any direction") is asserted but not procedurally enforced. The intervention arms install specific theological content (Bolz-Weber, Coakley, Ruether) selected by the research team. Participants randomised to the intervention arm receive that content; participants in the comparison arm do not. From the participant's standpoint this is not non-coercive: the trial is testing whether a particular theological reframing reduces shame, and the trial's success is defined as that reframing being effective. The consent process should disclose this with the specificity the protocol does not yet contain.
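The power statements in item 3 can be checked with a normal-approximation calculation. This is a rough sketch, not the protocol's own power analysis: it assumes a two-sided test at alpha = 0.05, equal group sizes, and treats d as the between-group standardised difference. Whether "n per arm" means the whole tradition arm or each randomised group within it is not something I can resolve from the protocol, so both readings appear in the loop.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison (normal approx.)."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    ncp = d * math.sqrt(n_per_group / 2)   # expected z statistic under effect d
    return _STD_NORMAL.cdf(ncp - z_crit) + _STD_NORMAL.cdf(-ncp - z_crit)


def n_per_group(d, power=0.80, alpha=0.05):
    """Approximate participants per group needed to detect effect size d."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_pow = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * (z_crit + z_pow) ** 2 / d ** 2)


# 40 and 60: an 80-120 arm split in two; 80 and 120: the arm size as the group size.
for n in (40, 60, 80, 120):
    print(f"n per group = {n:3d}, d = 0.5 -> power ~ {power_two_sample(0.5, n):.2f}")

print("n per group for 80% power:", {d: n_per_group(d) for d in (0.5, 0.3)})
```

Under these assumptions d = 0.5 needs about 63 participants per randomised group for 80% power, and d = 0.3 about 175. A contrast between two arm-level effects is a difference of differences, with twice the variance of either effect alone, which is why the H5 question needs substantially larger cells.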

Required: before this protocol can carry the manuscript's claim that intervention evidence is being generated, it needs (a) manualised comparison arm, (b) clearer specification of the H1 contrast as between-arm not pre-post, (c) IRB clearance, (d) OSF pre-registration, (e) consent language that explicitly describes the theological direction of the intervention content. Treat the current v0.1 as a discussion-with-pastors draft, which is what it self-describes as, and reflect that in the literature review's language.
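The distinction in (b) is easy to state in code. The sketch below uses invented Sex Guilt scores (lower is better) for two small groups; the function names are mine, and a real analysis would more likely be an ANCOVA or mixed model on the pre-registered contrast. It shows only that a large pre-post change in the intervention arm is compatible with a between-arm effect of zero.

```python
import math
from statistics import mean, stdev


def change_scores(baseline, week8):
    """Per-participant change (week 8 minus baseline); negative means less guilt."""
    return [post - pre for pre, post in zip(baseline, week8)]


def between_arm_d(change_int, change_cmp):
    """Standardised difference in mean change: intervention minus comparison."""
    n1, n2 = len(change_int), len(change_cmp)
    pooled_var = ((n1 - 1) * stdev(change_int) ** 2 +
                  (n2 - 1) * stdev(change_cmp) ** 2) / (n1 + n2 - 2)
    return (mean(change_int) - mean(change_cmp)) / math.sqrt(pooled_var)


# Invented scores: both arms improve by the same average amount.
int_change = change_scores([30, 28, 35, 32, 29, 31], [24, 23, 29, 27, 24, 25])
cmp_change = change_scores([31, 29, 34, 30, 33, 28], [25, 24, 28, 25, 27, 23])

print("mean change, intervention:", mean(int_change))
print("mean change, comparison:  ", mean(cmp_change))
print("between-arm d:            ", between_arm_d(int_change, cmp_change))
```

A success criterion defined on the first printed number alone would be met here even though the intervention adds nothing over the comparison condition.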

3.6 Preregistration coverage of the broader research program is uneven

The K.1–K.5 research questions are described as a "research program." Of the five, only K.5 has a draft protocol. K.1 (operationalising historical schemas as psychological measures) requires a scale-development program that is not started. K.2 (partitioning doctrine vs. institution vs. pedagogy) requires a mixed-methods study that is not specified. K.3 (East/West empirical comparison) requires recruitment infrastructure in Orthodox populations that does not exist. K.4 (psychophysiological pathway) requires SIS/SES + arousal measures the Vela platform does not currently administer.

This is not a fatal problem — research programs are aspirational by nature — but the manuscript currently presents the five questions with parallel rhetorical weight, as if each is equally close to execution. Only K.5 is. The other four are essentially editorialised wishlist items. Tier them: protocol-stage (K.5), study-design-stage (K.3), conceptual-stage (K.1, K.2, K.4). This honesty would not damage the contribution; it would tell readers what the program actually delivers.

3.7 Sample-population dependencies are under-discussed

Several of the manuscript's load-bearing empirical claims rely on a narrow band of samples:

  • The Grubbs program of research is heavily LDS- and evangelical-Protestant-weighted in its recruitment streams. Generalising to Catholic, Orthodox, or non-Christian populations requires the cross-tradition work (Rosmarin & Pirutinsky 2019; Kaplan et al. 2025; Rigo & Saroglou 2019) which the manuscript cites but does not weight against the LDS/evangelical sample dependency.
  • The Coates et al. (2026) finding — childhood purity-culture exposure independently predicts sexual shame among NSE survivors — has N = 85, including only 30 CSA survivors. The literature review calls this "what may be the most clinically consequential finding in the literature." It is also a finding from a single small sample. The strength-of-evidence framing should acknowledge this.
  • Sawatsky et al. (2025) draws on a snowball sample of N = 5,489 white Christian women recruited through online channels. The N is impressive; the sampling frame is not population-representative. The literature review reports the result without the recruitment caveat. A reviewer would catch this.

Required: add a sampling-frame note for each load-bearing empirical claim. The Methodology section of the OVERVIEW doc would benefit from this even if the literature review does not.

3.8 Effect-size translation is absent

The literature review repeatedly cites associations by sample size and significance level (e.g., Leonhardt et al. 2020, N = 1,614; Coates et al. 2026, p = 0.00083) without converting them into effect sizes or outcomes a clinically-oriented reader can interpret. What does a one-SD increase in purity-culture beliefs translate to in terms of marital-satisfaction units? What is the absolute difference in sexual-pain prevalence between high-PCBS and low-PCBS women? The empirical literature provides this information; the literature review elides it.

For a Vela-internal program document this is acceptable. For peer-reviewed publication it is not. The standard the Grubbs lab and the Coates–Meston lab use in their primary papers should be matched.
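Two of the simplest translations, sketched with placeholder inputs. None of the numbers below come from Leonhardt et al., Coates et al., or any other cited study; they stand in for values that would have to be read from the primary papers.

```python
def raw_units_per_sd(beta_std, sd_outcome):
    """Raw-unit change in the outcome per 1-SD increase in the predictor.

    Uses the identity that a standardised coefficient times the outcome's SD
    gives the shift in original scale units.
    """
    return beta_std * sd_outcome


def absolute_risk_difference(prev_high, prev_low):
    """Difference in prevalence between high- and low-exposure groups."""
    return prev_high - prev_low


# Placeholder inputs, NOT taken from any cited study.
shift = raw_units_per_sd(beta_std=-0.20, sd_outcome=1.5)
gap = absolute_risk_difference(prev_high=0.25, prev_low=0.10)

print(f"1 SD higher on the predictor: {shift:+.2f} scale points on the outcome")
print(f"prevalence gap between groups: {gap * 100:.0f} percentage points")
```

Reporting in this form lets a clinician judge whether an association that clears a significance threshold is also large enough to matter.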

3.9 No discussion of publication bias in the purity-culture and reclamation literatures

The purity-culture literature has, plausibly, the strongest publication-bias problem of any subfield the review covers. The qualitative foundation (Klein 2018) is auto-ethnographic-adjacent — the author is herself a deconverted purity-culture survivor. The Ortiz et al. PCBS development team includes scholars who have publicly identified as critics of purity culture. The Sawatsky et al. (2025) team includes the Gregoires, who run a prominent post-purity-culture deconstruction platform. None of this invalidates the research; it does mean that researcher allegiance effects are non-trivial in this corpus and the manuscript should say so.

Conversely, the sanctification literature is concentrated in Pargament's laboratory and the BYU-Provo network, both of which are mainline-LDS-and-Protestant in confessional orientation. The reclamation literature is overwhelmingly written by reclamation advocates. The literature on sexual shame and Christianity is not a literature with the dispersed researcher allegiance that, say, the obesity literature is. The manuscript should declare this.

3.10 The "open research questions" section identifies five gaps but does not address the most obvious one

The most important question this literature cannot answer — and which the manuscript does not pose as a K-question — is whether sexual shame in religiously-formed individuals is theologically caused at all, or whether it is socially caused (peer/family/community surveillance) with theology providing the post-hoc vocabulary. K.2 gestures at this ("which matters more: doctrine, institution, or pedagogy") but does not commit to the harder version: suppose theology is doing none of the work, and the same socialisation regime would produce the same shame profile under any moral vocabulary. The implicit counterfactual is unaddressed.

This matters because the theological-coherence intervention (K.5) tests whether changing the theological framing reduces shame. If the theology is not the proximate cause — if it is institutional regulation, family dynamics, peer-witnessing structure that does the work — then the intervention's effect size will be small or null even if every other design specification is correct, and the manuscript's central program-of-research bet will not pay off. A registered prediction here, even a directional one, would substantially improve the program's auditability.


4. Minor issues / presentation

  • Header consistency. The literature review has a YAML header; the corpus syntheses do not; the literature map has a paragraph header. A submitted manuscript needs a single canonical header format with date, version, and corresponding author. Currently the manuscript's metadata lives in three different shapes.
  • Reference list is not reconciled against in-text APA cites. I spot-checked and counted at least two references (Pagels 1988; Meyendorff 1979) that appear in the bibliography but are not cited in the main text body of the literature review. Conversely, the public introduction names "a clinical researcher named Tina Schermer Sellers" without a parenthetical year; the bibliography has Sellers (2017). Format consistency is required.
  • The "v0.1 — awaiting reconciliation + editorial pass" status flag in the literature review header reads as honest but is not appropriate for submission. Either ship as draft-for-internal-review (which is what this currently is) and remove the venue-aspirational framing, or commit to an editorial pass and remove the flag.
  • Section 4 (Purity Culture) and Section 7 (Theological Alternatives and Reclamation Literature) repeat material from the sanctification subsection (3.3) without explicit cross-reference. A reader processes the sanctification finding three times. Consolidate or cross-reference.
  • The public introduction's narrative voice does not match the literature review's voice. This is fine — they target different audiences — but the manuscript should make this explicit. Currently a reader who reads both back-to-back may be confused about which is the canonical document. The literature review header refers to the public introduction as a "general-audience companion"; the public introduction does not return the reference.
  • The Burrus (2008) complication is invoked once in §2.1 and then dropped. Burrus's argument — that shame did not become guilt but was transformed and re-grounded — is the most important complication of Harper's shame-to-sin thesis, which the manuscript treats as the central historical claim. The complication deserves a full paragraph and a return to it in §9. Currently it is acknowledged and then ignored.
  • East/West coverage is asymmetric in a way the manuscript does not own. The historical-theological side covers the East thinly (Stan & Turcescu 2010; Perisanidi 2017; Meyendorff 1979) compared to its coverage of the West. The empirical side covers the East not at all (§E.7). Both gaps are flagged. Neither is integrated into the conclusions, which read as if the East/West story is symmetric. It is not.
  • The protocol's funding numbers should not appear in the literature review. They are appropriate to the protocol document itself but undermine the literature review's framing as scholarly synthesis. Move all reference to dollar amounts and timelines to the protocol; cite the protocol from the review.
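The reference-list check in the second bullet is mechanical enough to script. A rough sketch, assuming citations are matched on (first-author surname, year) and that the bibliography is already available as a set of such keys; the sample text is invented, and the pattern is deliberately simple (it will mishandle three-author lists and capitalised words that happen to precede a year).

```python
import re

# Matches "Brown 1988", "Harper (2013)", "Grubbs & Perry 2019", "Grubbs et al. 2022".
CITE = re.compile(
    r"\b([A-Z][\w'-]+)"                             # first-author surname
    r"(?:,? (?:et al\.|(?:&|and) [A-Z][\w'-]+))?"   # optional "et al." or second author
    r",? \(?(\d{4})\)?"                             # year, optionally parenthesised
)


def cited_keys(text):
    """Extract (surname, year) pairs from running text."""
    return {(m.group(1), m.group(2)) for m in CITE.finditer(text)}


# Invented sample text and bibliography keys, for illustration only.
body = (
    "Harper (2013) traces the shift, and Brown 1988 remains standard. "
    "Grubbs & Perry 2019 and Grubbs et al. 2022 develop the model. "
    "Tina Schermer Sellers is named without a year."
)
bibliography = {
    ("Harper", "2013"), ("Brown", "1988"), ("Grubbs", "2019"), ("Grubbs", "2022"),
    ("Pagels", "1988"), ("Meyendorff", "1979"), ("Sellers", "2017"),
}

cited = cited_keys(body)
uncited = sorted(bibliography - cited)   # in the bibliography, never cited
missing = sorted(cited - bibliography)   # cited, absent from the bibliography

print("uncited:", uncited)
print("missing:", missing)
```

Entries in the first list are either dead references or, as with a name given without a year, citations the pattern cannot see; both cases need a human decision.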

5. Specific recommendations

The single biggest decision: which paper is this? The current document is three papers in one envelope. Pick one.

  • Option A (literature review, psychology-of-religion venue). Strip the historical-theological backbone to a tight 1,500-word context section. Lead with the empirical synthesis (§3 + §4). Foreground the moral-incongruence / sanctification double-pathway as the central interpretive claim. Replace the K-section with a single 2,000-word "directions for empirical research" section that prioritises K.5 (the intervention) and downgrades K.1–K.4 to a research-program appendix. Submit to Psychology of Religion and Spirituality. This is the best fit and the highest probability of placement.
  • Option B (historical-theological review, history-of-Christianity venue). Strip the empirical psychology. Foreground the Augustine-as-distortion and post-Augustinian-elaboration claims as the central historical contribution. Engage with Boswell, Brooten, and Dale Martin explicitly — which the current draft does not. Submit to Journal of Religious History or Church History. Requires ingest of the missing primary sources; not a six-week revision.
  • Option C (methods paper on LLM-assisted historiographic synthesis). Reframe the entire document as a case study in cross-validated dual-LLM literature mapping for under-resourced bibliographic domains. The reconciliation method between the two parallel drafts is the contribution. Submit to Quantitative Science Studies or a digital-humanities venue. The §3.1 and §3.2 concerns become the substance of the paper.

Each option is a publishable paper. The current draft is none of them, because each part is built to a different standard.

Other recommendations, regardless of which option is chosen:

  1. Hand-validation pass on the literature map. Sample n ≥ 20 rows. Read the cited papers. Report per-row whether the citation is correct and the claim is accurate. Report the per-cell error rate in the published manuscript.
  2. Distinguish corpus-internal claims from external-literature claims. Currently the synthesis treats both as evidentially equivalent. They are not. Corpus passages have RAG-resolution-level provenance; external citations are author-claimed.
  3. Add an explicit limitations section discussing (a) LLM-corpus partiality, (b) Anglophone/post-2015/indexable sampling bias, (c) researcher-allegiance effects in purity-culture and reclamation literatures, (d) the historical-empirical bridge as hypothesised rather than constructed.
  4. Strip the intervention protocol from the literature review. Cite it; do not include it. The two documents serve different functions.
  5. Tier the K-questions. Currently they have parallel weight. Only K.5 is protocol-stage.
  6. Engage Boswell, Brooten, and Dale Martin on the same-sex sexuality / gender-of-reception questions before any historical-theological submission.
  7. Specify the publication path before submission. The current document is venue-undecided, which is visible in the writing — the empirical sections are written to a Journal of Sex Research register and the historical sections to a Church History register, and the registers do not blend.

6. Verdict

Reject in present form, with strong encouragement to resubmit a venue-committed restructure.

The literature review is a genuinely useful internal program document and a credible synthesis of a field that very few researchers have the patience to integrate. It is also undercooked as a publication. The single most consequential weakness is the implicit assumption that the historical-theological narrative and the modern empirical literature are evidentiarily commensurable; they are not, and the manuscript's central methodological move — treating Augustine's 5th-century doctrinal categories as continuous with Mosher's 1968 instrument — is a conjecture, not a finding. The K.1 research question acknowledges this. The body of the manuscript does not.

The intervention protocol is the manuscript's most actionable artifact and the part most worth investment. As currently specified it would not meet the standard of any of the venues the protocol's "Deliverables" section names. The protocol authors should treat the §3.5 list as an action checklist before any IRB submission, not as reviewer pedantry.

If forced to pick the best single direction: Option A (psychology-of-religion venue, empirical-led restructure) is the highest-probability placement. Option C (methods paper) is the most novel contribution. Option B (history-of-Christianity venue) is the deepest scholarship if executed properly but requires substantial primary-source work first.

The dataset of citations is a real asset. The synthesis is interesting. The bridge claim is the part that needs work.


Reviewer signed off, 2026-05-22. Conflicts of interest: none declared. The reviewer notes the manuscript was produced with LLM assistance and recommends the corresponding author add an AI-disclosure statement per APA Publication Manual 7th edition / Council of Science Editors guidance, including (a) which sections were LLM-drafted, (b) which sections were LLM-discovered (citation suggestions), and (c) what human-verification pass was applied. The reconciliation note is a start; it should become a Methods subsection in any submitted version.