What is PeopleAnalyst?

PeopleAnalyst is the front door for people-analytics research: 205+ works indexed and profiled, 40+ citation-grade findings extracted, and peer-reviewed behavioral science translated from academic to actionable — the missing manual for the people analytics you always meant to do.

What is people analytics?

People analytics is not a dashboard. It is behavioral science and statistical inference applied to workforce decisions — a discipline with its own methodology, spanning measurement, organizational design, talent, leadership, and analytics craft.

Why does AI in HR need measurement science?

AI is being deployed in high-stakes people decisions — hiring, performance, attrition — without the measurement science to evaluate whether it works or whom it harms. Construct validity, effect sizes, and criterion validity are the vocabulary for asking an AI vendor the right questions.

How is the research made accessible?

The evidence is indexed and searchable: 205+ works, 40+ citation-grade insight cards, and 8 research arcs, so the right finding reaches the right decision at the right time.

What separates good people measurement from assertion?

Good measurement has a method: construct validity, reliability, and effect-size interpretation are not optional — they are what separates evidence from assertion.


Psychometric Theory

Jum C. Nunnally, Ira H. Bernstein

In a sentence

A comprehensive textbook for graduate students and researchers on the theory and statistical methods for creating, evaluating, and applying psychological measures, covering both classical and modern approaches.

The third edition of Nunnally's "Psychometric Theory" stands as a cornerstone text, updated by Ira Bernstein to bridge the gap between classical test theory and modern measurement innovations. This comprehensive guide is essential for graduate students and researchers in psychology, education, and business who need to construct or evaluate quantitative measures. It systematically builds from fundamental statistical concepts to advanced topics like item response theory, generalizability theory, and structural equation modeling. The book's strength lies in its emphasis on core principles, providing a robust framework for understanding measurement error, validity, reliability, and factor analysis. It doesn't just present formulas; it fosters a deep conceptual understanding of why and how psychological tests work, empowering readers to create scientifically sound instruments and critically assess the vast landscape of existing measures.

The four lenses

Science
Statistics
Systems
Strategy

The model

This model, implicit in psychometric theory, illustrates how the design characteristics of a multi-item measure (such as its length and the properties of its items) influence its reliability, which in turn is a necessary prerequisite for establishing its validity and ultimate scientific utility.

Test Length (design lever)

The number of discrete components (items) that are aggregated to form a composite measure.

Average Item Inter-correlation (design lever)

The average degree of linear relationship among the items within a measure, reflecting the extent to which they measure something in common.

Content Homogeneity (design lever)

The degree to which the items in a measure tap into a single, unitary psychological attribute or domain of content.

Methodological Heterogeneity (design lever)

The use of diverse methods, item formats, or situations to measure a construct, ensuring the resulting measure is not confounded with a specific method and possesses greater generalizability.

Measurement Reliability (psychological state)

The extent to which a measure is free from random measurement error, reflecting its consistency and precision. It is formally the ratio of true score variance to observed score variance.
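The variance-ratio definition above can be made concrete with a minimal sketch. It assumes the classical true-score model, in which an observed score is a true score plus uncorrelated random error, so observed variance is the sum of the two. The function names and the example numbers are illustrative and are not taken from the book.

```python
import math


def reliability(true_var: float, error_var: float) -> float:
    """Reliability as true-score variance over observed-score variance.

    Assumes observed = true + error with uncorrelated error, so
    observed variance = true_var + error_var.
    """
    observed_var = true_var + error_var
    if observed_var <= 0:
        raise ValueError("observed variance must be positive")
    return true_var / observed_var


def standard_error_of_measurement(observed_sd: float, rel: float) -> float:
    """Typical size of the random error in a single observed score."""
    return observed_sd * math.sqrt(1.0 - rel)


# Hypothetical scale: true-score variance 80, error variance 20,
# so the observed variance is 100 and the observed SD is 10.
rel = reliability(80.0, 20.0)
sem = standard_error_of_measurement(10.0, rel)
print(f"reliability = {rel:.2f}, SEM = {sem:.2f}")
```

With these made-up numbers, four fifths of the observed variance is true-score variance, and an individual score carries a standard error of roughly 4.5 points on a scale whose SD is 10.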

Construct Validity (outcome metric)

The degree to which a measure accurately reflects the theoretical construct it is intended to measure, demonstrated through a network of evidence about its internal structure and external relationships.

Predictive Validity (outcome metric)

The degree to which a measure accurately forecasts a specific, external criterion behavior.

Content Validity (outcome metric)

The degree to which the content of a measure represents an adequate and representative sample of a defined domain of content or behavior.

Scientific Utility (outcome metric)

The overall usefulness of a measure for advancing scientific understanding or solving practical problems, representing the ultimate goal of measurement.

How they connect

test length → influences measurement reliability
item intercorrelation → influences measurement reliability
content homogeneity → influences item intercorrelation
measurement reliability → influences construct validity
measurement reliability → influences predictive validity
methodological heterogeneity → influences construct validity
construct validity → predicts scientific utility
predictive validity → predicts scientific utility
content validity → predicts scientific utility
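Two of the links above have standard closed forms in classical test theory, sketched here as a rough illustration. The first is the standardized coefficient alpha, which depends only on test length and the average inter-item correlation. The second is the attenuation bound, under which an observed predictor-criterion correlation cannot exceed the square root of the product of the two reliabilities. The function names and the example values are assumptions made for illustration.

```python
import math


def standardized_alpha(n_items: int, mean_r: float) -> float:
    """Standardized coefficient alpha from test length and the
    average inter-item correlation."""
    if n_items < 2:
        raise ValueError("need at least two items")
    return n_items * mean_r / (1.0 + (n_items - 1) * mean_r)


def validity_ceiling(rel_predictor: float, rel_criterion: float = 1.0) -> float:
    """Largest observed predictor-criterion correlation that the two
    reliabilities allow (classical attenuation bound)."""
    return math.sqrt(rel_predictor * rel_criterion)


# Hypothetical scales whose items correlate 0.30 on average.
short_scale = standardized_alpha(5, 0.30)
long_scale = standardized_alpha(10, 0.30)
ceiling = validity_ceiling(long_scale)

print(f"5 items:  alpha = {short_scale:.3f}")
print(f"10 items: alpha = {long_scale:.3f}")
print(f"validity ceiling for the 10-item scale = {ceiling:.3f}")
```

Holding the average correlation fixed, doubling the item count raises alpha from about 0.68 to about 0.81. The ceiling shows why reliability is described as a prerequisite: an unreliable measure limits how strongly it can correlate with anything else.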

The story

The reader

Graduate students and researchers in psychology, education, and related behavioral sciences who need to create, evaluate, or apply quantitative measures of human attributes. They want to conduct rigorous, defensible research and make sound decisions, but are often unsure how to navigate the complex statistical landscape of psychometrics.

External problem

Developing or selecting a good psychological measure is difficult. It requires navigating complex statistical concepts, choosing among different theoretical models (classical vs. modern), and rigorously assessing properties like reliability and validity.

Internal problem

They feel intimidated by the mathematical complexity of measurement theory and uncertain about the quality of their own or others' measures. They fear their research might be built on a shaky foundation, leading to invalid conclusions and wasted effort.

Philosophical problem

It's just plain wrong for scientific progress and important real-world decisions to be hindered by poorly understood or improperly constructed measurement tools. Rigorous measurement is the bedrock of scientific psychology.

The plan

Establish a firm grasp of the fundamental concepts of measurement, scales, statistics, and validity.
Master Classical Test Theory to understand measurement error and build reliable multi-item scales using techniques like domain sampling and Cronbach's alpha.
Learn to use factor analysis (both exploratory and confirmatory) to uncover and test the underlying structure of your measures and constructs.
Explore modern approaches, including Item Response Theory (IRT) and other advanced statistical models, to tackle specialized measurement challenges like test bias and adaptive testing.
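As a concrete companion to the second step of the plan, the following is a minimal sketch of Cronbach's alpha computed from raw item scores, using only the Python standard library. The score table is invented for illustration, and the function name is an assumption rather than anything defined in the book.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Coefficient alpha for a respondents-by-items score table.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 4 respondents (rows) answering 3 items (columns).
data = [
    [2.0, 3.0, 3.0],
    [4.0, 4.0, 5.0],
    [3.0, 2.0, 4.0],
    [5.0, 5.0, 4.0],
]
alpha = cronbach_alpha(data)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why it is read as an index of internal consistency. A table this small is only for checking the arithmetic by hand; real scale development needs far more respondents.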

Success

Confidently design and validate high-quality psychological measures.
Critically evaluate the psychometric properties of instruments used in research and practice.
Produce more rigorous, replicable, and theoretically meaningful research findings.
Make more accurate, fair, and defensible decisions in applied settings like education, industry, and clinical practice.

At stake

Continue to use measurement tools without fully understanding their properties or limitations.
Produce research with questionable validity that fails to replicate.
Risk making flawed decisions about individuals based on unreliable or invalid test scores.
Remain on the sidelines of quantitative research, unable to fully participate in or critique the methods that drive the field.

Questions this book answers

What are the fundamental principles of scientific measurement and the different types of measurement scales?
How is the validity of a psychological measure established across its different forms: content, construct, and predictive?
What is the statistical foundation of psychometric theory, including correlation, regression, and the properties of linear combinations?
How does Classical Test Theory (CTT), particularly the domain-sampling model, conceptualize and quantify measurement error to assess reliability?
What are the practical steps and statistical procedures for constructing conventional multi-item tests, from item writing and analysis to norming?

Glossary

Test Length: The number of discrete components (items) that are aggregated to form a composite measure. According to the domain-sampling model, a longer test provides a larger and more stable sample of the content domain.
Average Item Inter-correlation: The average degree of linear relationship among the items within a measure. It reflects the extent to which the items share a common core or underlying factor.
Content Homogeneity: The degree to which the items in a measure tap into a single, unitary, and clearly defined psychological attribute or domain of content. A homogeneous measure is unidimensional.
Methodological Heterogeneity: The degree to which a construct is measured using a diversity of methods, item formats, or situations. This practice ensures that the measured construct is not merely an artifact of a single method and enhances its generalizability.
Measurement Reliability: The extent to which a measure is free from random measurement error, reflecting its consistency, repeatability, and precision. Formally, it is the proportion of observed score variance that is attributable to true score variance.
Construct Validity: The degree to which a measure faithfully represents the theoretical construct it purports to assess. It is the central, unifying concept of validity, supported by a cumulative network of evidence regarding a measure's meaning.
Predictive Validity: The degree to which a measure accurately forecasts a specific, external criterion behavior. It is a pragmatic form of validity focused on the functional relationship between a predictor and an outcome.
Content Validity: The degree to which the items or content of a measure constitute a representative and adequate sample of a defined domain of knowledge, skill, or behavior.
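The glossary's point that a longer test gives a more stable sample of the domain is usually quantified with the Spearman-Brown formula, which this page does not name. The sketch below assumes the added items are parallel to the existing ones, meaning they have the same quality and measure the same attribute. The function names and numbers are illustrative.

```python
def spearman_brown(rel: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by length_factor
    (2.0 means twice as many comparable items)."""
    return length_factor * rel / (1.0 + (length_factor - 1.0) * rel)


def length_factor_needed(rel: float, target: float) -> float:
    """How many times longer the test must be to reach a target reliability."""
    if not (0.0 < rel < 1.0 and 0.0 < target < 1.0):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    return target * (1.0 - rel) / (rel * (1.0 - target))


# Hypothetical 10-item scale with reliability 0.60.
doubled = spearman_brown(0.60, 2.0)
factor = length_factor_needed(0.60, 0.80)

print(f"reliability with 20 items: {doubled:.2f}")
print(f"length factor needed for 0.80: {factor:.2f}")
```

Under these assumptions, doubling the scale lifts reliability from 0.60 to 0.75, and reaching 0.80 would take roughly 2.7 times as many items, or about 27 in place of 10. The gain from each added item shrinks as the test grows.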
