Design, Evaluation, and Analysis of Questionnaires for Survey Research (Wiley Series in Survey Methodology)
In a sentence
A systematic, science-based method for designing survey questions, predicting their measurement quality before data collection, and correcting for measurement error in analysis.
This book transforms questionnaire design from an 'art' into a scientific activity. Saris and Gallhofer present a complete program: a three-step procedure for operationalizing complex concepts into concrete survey requests, a thorough mapping of the design choices researchers face (response scales, item structure, batteries, data collection mode), and a rigorous framework for estimating the reliability, validity, and method effects of survey questions using multitrait-multimethod (MTMM) experiments. The crowning achievement is the SQP (Survey Quality Predictor) program, built on a meta-analysis of thousands of MTMM experiments across dozens of countries and languages, which predicts the quality of any survey question from its coded characteristics—before it is ever fielded. The authors then show how to use these quality estimates to correct for measurement error in substantive analyses and in cross-cultural comparisons, demonstrating that ignoring measurement quality leads to seriously biased conclusions about relationships and means.
The story it tells the reader
The reader
A survey researcher or social scientist who wants to design questionnaires that accurately measure the concepts they care about and yield trustworthy results.
External problem
Survey questions contain measurement error that biases estimates of relationships and means, and there is no easy way to know a question's quality before fielding it.
Internal problem
The researcher feels uncertain whether their data truly measure what they intend and anxious that their conclusions may be artifacts of poor questions.
Philosophical problem
Treating questionnaire design as an unteachable 'art' is wrong when scientific methods can make the consequences of design choices known and controllable.
The plan
- Operationalize complex concepts into concrete requests using the three-step procedure.
- Make informed choices about response scales, item structure, batteries, and data collection mode.
- Predict the quality of each question with SQP before fielding and improve weak questions.
- Estimate reliability, validity, and method effects where possible via MTMM designs.
- Correct correlations and estimates for measurement error in substantive analysis.
- Test equivalence and correct for measurement quality before cross-cultural comparison.
Success
- The researcher fields higher-quality questions, knows their measurement quality in advance, obtains less biased estimates of relationships and means, and can make valid cross-cultural comparisons.
At stake
- The researcher unknowingly uses low-quality questions, draws biased conclusions, attributes measurement artifacts to substantive differences, and produces non-comparable cross-national results.
Model of the world · 12 constructs · 18 relations
A causal-path model in which question design choices and contextual conditions (language, country, mode) influence the psychological response process and method reactions, which in turn determine measurement quality (reliability, validity, method effects), and ultimately the accuracy of substantive estimates of relationships and means.
Design levers
- Question Design Choices
Intermediate states & behaviors
- Cognitive Response Process
- Method Reaction
Outcomes
- Total Measurement Quality
- Validity
- Accuracy of Substantive Estimates
- Reliability
- Cross-Cultural Comparability
- Item Nonresponse
- Response Bias
- Composite Score Quality
Moderators / context: Contextual Conditions
Question Design Choices (design lever)
The set of controllable formulation decisions for a survey request, including request type (direct/indirect, WH-word), response scale type and number of categories, labeling, balance, use of batteries, additional components, and fixed reference points.
Contextual Conditions (contextual condition)
The survey context not chosen freely by the designer, including the country, the language of administration and translation, the data collection mode, and the position of the question within the questionnaire.
Cognitive Response Process (psychological state)
The internal brain process triggered by a request that converts the respondent's underlying opinion on the concept of interest into a preliminary reaction, characterized by an intercept and slope linking the latent concept to the reaction.
Method Reaction (psychological state)
The systematic, respondent-specific reaction to the particular method used to express an answer (e.g., scale type, agree/disagree format), producing common method variance shared across items measured with the same method.
Reliability (outcome metric)
The strength of the relationship between the observed response and its true score, reflecting the degree to which the measure is free of random measurement error; the reliability coefficient squared is the proportion of observed variance due to the true score.
Validity (outcome metric)
The strength of the relationship between the latent trait of interest and the true score, reflecting freedom from systematic method error; its complement is the method effect, since validity squared plus method effect squared equals one.
Total Measurement Quality (outcome metric)
The overall strength of the relationship between the observed variable and the latent concept of interest. It equals the product of reliability and validity, q² = r² · v², where q = r · v is the quality coefficient, and it determines how much observed variance reflects the intended construct.
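The three definitions above can be tied together in a short sketch. It assumes standardized variables, takes a reliability coefficient `r` and a validity coefficient `v` as inputs, and uses the identity stated above (validity squared plus method effect squared equals one) to split the observed variance into a trait share, a method share, and a random-error share. The function name and the example values are hypothetical, chosen only for illustration.

```python
import math


def quality_components(r: float, v: float) -> dict:
    """Split standardized observed variance given reliability and validity coefficients.

    r: reliability coefficient (observed response vs. true score)
    v: validity coefficient (latent trait vs. true score)
    """
    if not (0.0 <= r <= 1.0 and 0.0 <= v <= 1.0):
        raise ValueError("coefficients must lie in [0, 1]")
    # Method effect coefficient, from v^2 + m^2 = 1.
    m = math.sqrt(1.0 - v ** 2)
    return {
        "method_effect": m,
        "quality": (r * v) ** 2,          # variance share due to the concept of interest
        "method_variance": (r * m) ** 2,  # systematic variance share due to the method
        "random_error": 1.0 - r ** 2,     # variance share due to random error
    }


# Hypothetical coefficients, not taken from the book or from SQP.
parts = quality_components(r=0.9, v=0.95)
for name, value in parts.items():
    print(f"{name}: {value:.3f}")
```

The three variance shares sum to one by construction, which is a quick sanity check when entering coefficients by hand.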
Item Nonresponse (outcome metric)
The extent of missing values on a survey item, which disrupts analysis and can render results unrepresentative of the population; treated as a basic quality criterion alongside bias.
Response Bias (outcome metric)
A systematic difference between the real values of the variable of interest and the observed scores corrected for random error, reflected in shifted response distributions across methods.
Composite Score Quality (outcome metric)
The quality of an aggregated measure (sum or weighted composite) of multiple concepts-by-intuition used to represent a concept-by-postulation, derived from the qualities of its component indicators and the chosen weights.
Accuracy of Substantive Estimates (outcome metric)
The degree to which estimated relationships between variables and comparisons of means reflect true population values rather than artifacts of measurement error; improved by correcting correlations and estimates using quality information.
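As a rough illustration of what correcting a correlation with quality information can look like, the sketch below assumes the usual MTMM true-score decomposition of an observed correlation: a trait part (the latent correlation scaled by both quality coefficients) plus, when both questions share a method, a common-method part. That decomposition, the function name, and all parameter names are assumptions made for this example, not a transcription of the authors' procedure.

```python
import math


def corrected_correlation(r_obs: float,
                          r1: float, v1: float,
                          r2: float, v2: float,
                          shared_method: bool = False) -> float:
    """Estimate the latent correlation from an observed one.

    Assumed model (standardized variables):
        r_obs = (r1*v1) * rho * (r2*v2) + r1*m1*m2*r2
    where the second term applies only if both items use the same method
    and m_i = sqrt(1 - v_i^2).
    """
    q1, q2 = r1 * v1, r2 * v2  # quality coefficients of the two measures
    if q1 * q2 == 0.0:
        raise ValueError("quality coefficients must be non-zero")
    common_method = 0.0
    if shared_method:
        m1 = math.sqrt(1.0 - v1 ** 2)
        m2 = math.sqrt(1.0 - v2 ** 2)
        common_method = r1 * m1 * m2 * r2
    return (r_obs - common_method) / (q1 * q2)


# Hypothetical inputs: the same observed correlation, with and without
# assuming that the two questions share a response method.
print(corrected_correlation(0.30, 0.80, 0.90, 0.85, 0.95))
print(corrected_correlation(0.30, 0.80, 0.90, 0.85, 0.95, shared_method=True))
```

Under this assumed model, random error pulls the observed correlation toward zero while shared method variance inflates it, so the corrected value can be larger or smaller than the observed one depending on which effect dominates.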
Cross-Cultural Comparability (outcome metric)
The extent to which measures have equivalent meaning across countries and languages (configural, metric, scalar, or cognitive equivalence), enabling valid comparison of means and relationships across groups.
How they connect
- question design choices → influences cognitive response process
- question design choices → influences method reaction
- question design choices → predicts reliability
- question design choices → predicts validity
- contextual conditions → moderates reliability
- contextual conditions → moderates cognitive response process
- cognitive response process → influences validity
- method reaction → negatively influences validity
- method reaction → influences response bias
- reliability → predicts total measurement quality
- validity → predicts total measurement quality
- total measurement quality → predicts composite score quality
- total measurement quality → influences analytic accuracy
- composite score quality → influences analytic accuracy
- item nonresponse → negatively influences analytic accuracy
- contextual conditions → influences cross cultural comparability
- total measurement quality → influences cross cultural comparability
- cross cultural comparability → influences analytic accuracy
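The relations listed above can be encoded as a small directed graph, which makes it easy to ask which constructs sit upstream of a given outcome. The sketch below transcribes the 18 relations as (source, target, relation) triples, using the construct names as they appear in the list; the `upstream` helper is a hypothetical name introduced for this example.

```python
# (source, target, relation) triples transcribed from the list above.
RELATIONS = [
    ("question design choices", "cognitive response process", "influences"),
    ("question design choices", "method reaction", "influences"),
    ("question design choices", "reliability", "predicts"),
    ("question design choices", "validity", "predicts"),
    ("contextual conditions", "reliability", "moderates"),
    ("contextual conditions", "cognitive response process", "moderates"),
    ("cognitive response process", "validity", "influences"),
    ("method reaction", "validity", "influences"),
    ("method reaction", "response bias", "influences"),
    ("reliability", "total measurement quality", "predicts"),
    ("validity", "total measurement quality", "predicts"),
    ("total measurement quality", "composite score quality", "predicts"),
    ("total measurement quality", "analytic accuracy", "influences"),
    ("composite score quality", "analytic accuracy", "influences"),
    ("item nonresponse", "analytic accuracy", "influences"),
    ("contextual conditions", "cross cultural comparability", "influences"),
    ("total measurement quality", "cross cultural comparability", "influences"),
    ("cross cultural comparability", "analytic accuracy", "influences"),
]


def upstream(target: str) -> set:
    """Return every construct with a directed path into `target`."""
    parents = {}
    for source, dest, _ in RELATIONS:
        parents.setdefault(dest, set()).add(source)
    seen, stack = set(), [target]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


constructs = {name for s, t, _ in RELATIONS for name in (s, t)}
print(len(constructs), "constructs,", len(RELATIONS), "relations")
print(sorted(upstream("analytic accuracy")))
```

In this encoding, response bias is the only construct with no path into analytic accuracy, since the list gives it no outgoing relation.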
Possible measures & feedback loops
A candidate team / org survey built from this book’s model — exploratory operationalizations, not validated instruments. Where a construct maps to a validated measure in Principia, we’ll point to that instead.
Question Design Choices
SQP coded characteristics; scale length; labeling completeness
self-report suitability: none
Contextual Conditions
country code; language code; mode code; position index
self-report suitability: none
Cognitive Response Process
estimated cognitive slope; estimated cognitive intercept
self-report suitability: low
Method Reaction
method factor loading; common method variance
self-report suitability: none
Reliability
reliability coefficient; SQP-predicted reliability
self-report suitability: none
Validity
validity coefficient; SQP-predicted validity
self-report suitability: none
Total Measurement Quality
quality coefficient squared; SQP-predicted quality
self-report suitability: none
Item Nonresponse
missingness rate
self-report suitability: none
Response Bias
distribution differences across methods; deviation from factual benchmarks
self-report suitability: none
Composite Score Quality
composite-latent correlation; invalidity due to method in composite
self-report suitability: none
Accuracy of Substantive Estimates
corrected vs uncorrected effect sizes; corrected vs uncorrected explained variance
self-report suitability: none
Cross-Cultural Comparability
equality of slopes/intercepts; JRule judgments; power-aware misspecification indicators
self-report suitability: none
Frameworks & instruments in this book
- Operationalization should proceed deliberately from concept-by-postulation to concept-by-intuition to assertion to request for an answer.
- Every design decision has a measurable consequence for question quality.
- Quality must be defined and estimated separately as reliability, validity, and method effect.
- Measurement error is unavoidable and must be corrected for in analysis, not ignored.
- Comparisons across methods, countries, and languages are valid only after establishing equivalence and correcting for measurement quality.
- Prefer simpler, higher-quality formulations (item-specific, fewer components, fixed reference points) over complex or battery formats when feasible.
Several of these are operationalized as tools in the People Analytics Toolbox.
Topics
- applied statistics
- research methods
Related in the library
- Reliability and Validity Assessment, by Edward G. Carmines & Richard A. Zeller (Statistics)
- 12: The Elements of Great Managing, by Rodd Wagner & James Harter (Statistics)
- Antifragile (Incerto), by Nassim Nicholas Taleb (Statistics)
- Big Data: A Very Short Introduction (Very Short Introductions), by Dawn E. Holmes (Statistics)
- Compensation, by Lance A. Berger & Dorothy Berger (Statistics)
- Compensation and Benefit Design, by Bashker D. Biswas (Statistics)