Exploratory Factor Analysis (Understanding Statistics)
In a sentence
A practical, formula-light, step-by-step guide to conducting exploratory factor analysis (EFA) in SPSS using evidence-based best practices.
Exploratory factor analysis is more than a century old and ubiquitous across the behavioral, medical, and social sciences, yet surveys repeatedly show that it is routinely misapplied because researchers receive little formal training and lean on poor software defaults. Marley Watkins addresses this gap with a concise, accessible, applied manual that walks the reader through every decision step of an EFA: choosing variables and participants, screening data, judging whether EFA is appropriate, and selecting the model, extraction method, number of factors, and rotation, followed by interpretation and reporting. Each step is illustrated with annotated SPSS screenshots, syntax, downloadable datasets, and scholarly citations. With minimal mathematics and a calm, jargon-light tone, the book equips students and seasoned researchers alike to produce defensible, replicable factor-analytic results and to respond confidently to editorial reviews.
The story it tells the reader
The reader
An applied researcher or graduate student who wants to conduct a credible, publishable exploratory factor analysis in SPSS.
External problem
They must make many technical EFA decisions in SPSS with little training and unsound software defaults.
Internal problem
They feel uncertain, intimidated by the math, and worried their analysis is wrong or indefensible.
Philosophical problem
Sloppy, default-driven factor analysis distorts science by creating false certainty and non-replicable results, which is just plain wrong.
The plan
- Follow the ten-step EFA decision checklist in order.
- Screen data and verify EFA is appropriate before analyzing.
- Choose the common factor model with a justified extraction method.
- Use multiple criteria (parallel analysis, MAP, scree, theory) to decide factor number.
- Apply oblique rotation, interpret competing models, and report every decision transparently.
Success
- The reader produces defensible, replicable EFA results.
- They can justify every analytic choice to reviewers with citations.
- They confidently interpret, name, and report factors and understand when to use EFA versus CFA.
At stake
- The reader accepts unsound defaults and produces distorted, meaningless solutions.
- Their flawed results mislead theory and instrument development and fail to replicate.
- They are unable to defend their methods against editorial review.
Model of the world · 13 constructs · 14 relations
A process model in which the quality of an exploratory factor analysis solution depends on a sequence of researcher decisions (design levers) operating on data conditions, mediated by adherence to evidence-based practice and the appropriateness of the correlation structure, producing interpretable, replicable, well-reported factor solutions.
Design levers
- Factor Retention Accuracy
- Variable Selection Quality
- Data Screening Rigor
- Correlation Type Appropriateness
- Common Factor Model Choice
- Extraction Method Appropriateness
- Rotation Appropriateness
Intermediate states & behaviors
- Adherence to Evidence-Based Practice
Outcomes
- Solution Interpretability
- Reporting Transparency
- Replicability and Construct Validity
Moderators / context: Correlation Matrix Appropriateness · Sample Adequacy
Variable Selection Quality · design lever
The degree to which the measured variables included in the analysis adequately and validly sample the domain of interest with sufficient reliability and at least three indicators per anticipated factor.
Sample Adequacy · contextual condition
The degree to which the participant sample is appropriate in representativeness and sufficiently large given communality, factor overdetermination, data type, and missingness to yield stable factor recovery.
Data Screening Rigor · design lever
The thoroughness with which linearity, distributional normality, outliers, restricted range, and missing data are inspected and appropriately handled prior to factor analysis using both statistics and graphics.
Correlation Matrix Appropriateness · contextual condition
The extent to which the correlation matrix contains sufficient common variance for factoring, evidenced by coefficients at or above .30, an acceptable determinant, statistically significant Bartlett's test, and adequate KMO sampling adequacy values.
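The checks named above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the book's SPSS procedure: the function names `bartlett_sphericity` and `kmo` and the small demonstration matrix are assumptions made for this example, and the formulas are the standard textbook forms of Bartlett's test of sphericity and the overall Kaiser-Meyer-Olkin measure.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Bartlett's test that the correlation matrix R is an identity matrix.

    R is a p x p correlation matrix and n is the number of participants.
    Returns the chi-square statistic and its degrees of freedom.
    """
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)          # log of the determinant of R
    chi_square = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi_square, df

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)           # partial (anti-image) correlations
    off = ~np.eye(R.shape[0], dtype=bool)     # mask selecting off-diagonal cells
    r2 = np.sum(R[off] ** 2)
    q2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + q2)

# Hypothetical 3-variable correlation matrix, used only for demonstration.
R_demo = np.array([[1.0, 0.5, 0.5],
                   [0.5, 1.0, 0.5],
                   [0.5, 0.5, 1.0]])

determinant = np.linalg.det(R_demo)
chi_square, df = bartlett_sphericity(R_demo, n=100)
kmo_value = kmo(R_demo)

# Proportion of off-diagonal coefficients at or above .30 in absolute value.
lower = R_demo[np.tril_indices(3, k=-1)]
prop_salient = np.mean(np.abs(lower) >= 0.30)
```

A p value for the Bartlett statistic could be obtained from a chi-square distribution with `df` degrees of freedom, for example with `scipy.stats.chi2.sf(chi_square, df)`.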
Correlation Type Appropriateness · design lever
The degree to which the type of correlation coefficient used (Pearson versus polychoric or other) matches the measurement level and distributional characteristics of the variables, especially for ordinal or nonnormal data.
Common Factor Model Choice · design lever
The decision to use the common factor model (EFA) rather than principal components analysis when the goal is to represent latent structure by separating common variance from unique and error variance.
Extraction Method Appropriateness · design lever
The degree to which the chosen factor extraction method (e.g., maximum likelihood versus least-squares/principal axis) matches the data's distributional assumptions, sample size, and factor strength to recover factors accurately.
Factor Retention Accuracy · design lever
The degree to which the number of factors retained matches the true latent dimensionality, determined using convergent evidence from parallel analysis, minimum average partial, scree, theory, and a model-selection comparison rather than discredited single rules.
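Of the retention criteria listed above, parallel analysis is the easiest to sketch in code. The version below is a simplified illustration under stated assumptions: it compares eigenvalues of the unreduced correlation matrix with the 95th percentile of eigenvalues from random normal data of the same size, which is one common variant and not necessarily the one the book implements. The function name `parallel_analysis`, its parameters, and the simulated two-factor dataset are hypothetical.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Suggest a number of factors by comparing real and random eigenvalues.

    data is an (n participants) x (p variables) array. Factors are counted
    until the first real eigenvalue that fails to exceed its random threshold.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    real = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

    random_eigs = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))   # uncorrelated data, same shape
        corr = np.corrcoef(noise, rowvar=False)
        random_eigs[i] = np.sort(np.linalg.eigvalsh(corr))[::-1]

    threshold = np.percentile(random_eigs, percentile, axis=0)
    exceeds = real > threshold
    # Index of the first False is the count of leading eigenvalues retained.
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, real, threshold

# Simulated data with two uncorrelated factors, three indicators each
# (loadings of .70); values are illustrative assumptions only.
rng = np.random.default_rng(42)
factors = rng.standard_normal((500, 2))
loadings = np.zeros((6, 2))
loadings[:3, 0] = 0.7
loadings[3:, 1] = 0.7
unique = rng.standard_normal((500, 6)) * np.sqrt(1 - 0.7 ** 2)
data = factors @ loadings.T + unique

n_factors, real_eigs, thresholds = parallel_analysis(data)
```

Because the definition above calls for convergent evidence, the suggested count would be weighed alongside MAP, the scree plot, and theory rather than accepted on its own.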
Rotation Appropriateness · design lever
The suitability of the rotation choice—favoring oblique rotations that allow correlated factors—for improving interpretability and honoring the typical intercorrelation among social-science constructs.
Adherence to Evidence-Based Practice · behavioral pattern
The overall extent to which the researcher follows documented best-practice recommendations across all decision steps rather than accepting unsound software defaults or arbitrary conventions.
Solution Interpretability · outcome metric
The degree to which the resulting factor solution exhibits approximate simple structure, salient and theoretically coherent loadings, adequate scale reliability, and small residuals without symptoms of over- or underextraction.
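The "small residuals" part of this definition can be made concrete with a short sketch. It assumes the usual reproduced-correlation form (pattern matrix times factor correlations times the transposed pattern matrix) and summarizes the off-diagonal residuals with the root mean square residual (RMSR) and a count of residuals above .10. The function name `residual_summary` and the four-variable example are hypothetical.

```python
import numpy as np

def residual_summary(R, pattern, phi=None, cutoff=0.10):
    """Return (RMSR, number of absolute off-diagonal residuals above cutoff).

    R is the observed correlation matrix, pattern is the p x m pattern
    matrix, and phi is the m x m factor correlation matrix (identity when
    the factors are treated as uncorrelated).
    """
    p, m = pattern.shape
    if phi is None:
        phi = np.eye(m)
    reproduced = pattern @ phi @ pattern.T
    residuals = R - reproduced
    lower = residuals[np.tril_indices(p, k=-1)]   # each variable pair once
    rmsr = float(np.sqrt(np.mean(lower ** 2)))
    n_large = int(np.sum(np.abs(lower) > cutoff))
    return rmsr, n_large

# Hypothetical one-factor solution for four variables.
pattern = np.array([[0.8], [0.7], [0.6], [0.5]])

# Observed matrix equals the model-implied one except for a single pair,
# which is deliberately misfit to produce one large residual.
R_obs = pattern @ pattern.T
np.fill_diagonal(R_obs, 1.0)
R_obs[2, 3] = R_obs[3, 2] = 0.45

rmsr, n_large = residual_summary(R_obs, pattern)
```

In this constructed case one residual exceeds the cutoff, the kind of result that might prompt a check for underextraction.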
Reporting Transparency · outcome metric
The completeness and clarity with which all analytic decisions, software, statistics, and results are reported so that an independent reader could review, replicate, and accumulate knowledge from the study.
Replicability and Construct Validity · outcome metric
The ultimate scientific value of the factor solution, reflected in its reproducibility across samples and methods and the meaningfulness of its relationships with external criteria within a construct-validation program.
How they connect
- Variable Selection Quality → influences Correlation Matrix Appropriateness
- Sample Adequacy → influences Correlation Matrix Appropriateness
- Data Screening Rigor → influences Correlation Matrix Appropriateness
- Correlation Type Appropriateness → moderates Correlation Matrix Appropriateness
- Correlation Matrix Appropriateness → predicts Factor Retention Accuracy
- Common Factor Model Choice → influences Solution Interpretability
- Extraction Method Appropriateness → influences Solution Interpretability
- Factor Retention Accuracy → predicts Solution Interpretability
- Rotation Appropriateness → influences Solution Interpretability
- Adherence to Evidence-Based Practice → influences Factor Retention Accuracy
- Adherence to Evidence-Based Practice → mediates Solution Interpretability
- Solution Interpretability → predicts Replicability and Construct Validity
- Reporting Transparency → influences Replicability and Construct Validity
- Adherence to Evidence-Based Practice → influences Reporting Transparency
Possible measures & feedback loops
A candidate team / org survey built from this book’s model — exploratory operationalizations, not validated instruments. Where a construct maps to a validated measure in Principia, we’ll point to that instead.
Variable Selection Quality
reliability coefficients; indicators-per-factor count; communality estimates
self-report suitability: low
Sample Adequacy
N; participant:variable ratio; communality × overdetermination interaction
self-report suitability: low
Data Screening Rigor
skew/kurtosis values; outlier counts; percent missing
self-report suitability: low
Correlation Matrix Appropriateness
KMO value; Bartlett's chi-square/p; determinant; proportion of r ≥ .30
self-report suitability: none
Correlation Type Appropriateness
number of ordered categories; skew/kurtosis; matrix type used
self-report suitability: none
Common Factor Model Choice
model type reported; communality estimation method
self-report suitability: none
Extraction Method Appropriateness
method reported; Heywood-case occurrence; iterations to convergence
self-report suitability: none
Factor Retention Accuracy
criteria agreement count; real vs random eigenvalues; MAP minimum
self-report suitability: none
Rotation Appropriateness
rotation type reported; interfactor correlations; pattern/structure coefficients
self-report suitability: none
Adherence to Evidence-Based Practice
proportion of steps with stated rationale; default-avoidance count
self-report suitability: medium
Solution Interpretability
RMSR; count of residuals > .10; salient-loading pattern; alpha/omega
self-report suitability: none
Reporting Transparency
checklist element coverage; presence of software/version and matrices
self-report suitability: medium
Replicability and Construct Validity
congruence across samples; stability across methods; external correlate magnitudes
self-report suitability: none
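As one illustration of the "congruence across samples" measure listed above, the sketch below computes Tucker's congruence coefficient between the loadings of a factor estimated in two samples. Choosing this particular index is an assumption made for the example; the function name `tucker_congruence` and the loading values are hypothetical.

```python
import numpy as np

def tucker_congruence(a, b):
    """Tucker's congruence coefficient between two loading vectors.

    a and b hold one factor's loadings on the same variables, in the same
    order, estimated in two different samples.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

# Hypothetical loadings for the same factor in two samples.
sample_1 = [0.8, 0.7, 0.6, 0.1]
sample_2 = [0.7, 0.8, 0.5, 0.2]

phi = tucker_congruence(sample_1, sample_2)
```

Values near 1 indicate that the loading patterns are nearly proportional across samples. The coefficient assumes the factors have already been matched and, where necessary, reflected to the same sign.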
Frameworks & instruments in this book
- Parsimony: explain the most common variance with the fewest interpretable factors.
- Simple structure: each variable should load saliently on as few factors as possible.
- Evidence-based decision-making over reliance on software defaults.
- Transparency and replicability in reporting all analytic choices.
- Models only approximate reality; factors must be validated and not reified.
Several of these are operationalized as tools in the People Analytics Toolbox.
Topics
- applied statistics
- research methods
Related in the library
- 12: The Elements of Great Managing · Rodd Wagner & James Harter · Statistics
- Antifragile (Incerto) · Nassim Nicholas Taleb · Statistics
- Big Data: A Very Short Introduction (Very Short Introductions) · Dawn E. Holmes · Statistics
- Compensation · Lance A. Berger & Dorothy Berger · Statistics
- Compensation and Benefit Design · Bashker D. Biswas · Statistics
- Cultures and Organizations: Software of the Mind, Third Edition · Geert Hofstede, Gert Jan Hofstede & Michael Minkov · Statistics