Design, Evaluation, and Analysis of Questionnaires for Survey Research (Wiley Series in Survey Methodology)

In a sentence

A systematic, science-based method for designing survey questions, predicting their measurement quality before data collection, and correcting for measurement error in analysis.

This book transforms questionnaire design from an 'art' into a scientific activity. Saris and Gallhofer present a complete program: a three-step procedure for operationalizing complex concepts into concrete survey requests, a thorough mapping of the design choices researchers face (response scales, item structure, batteries, data collection mode), and a rigorous framework for estimating the reliability, validity, and method effects of survey questions using multitrait-multimethod (MTMM) experiments. The crowning achievement is the SQP (Survey Quality Predictor) program, built on a meta-analysis of thousands of MTMM experiments across dozens of countries and languages, which predicts the quality of any survey question from its coded characteristics—before it is ever fielded. The authors then show how to use these quality estimates to correct for measurement error in substantive analyses and in cross-cultural comparisons, demonstrating that ignoring measurement quality leads to seriously biased conclusions about relationships and means.

The story it tells the reader

The reader

A survey researcher or social scientist who wants to design questionnaires that accurately measure the concepts they care about and yield trustworthy results.

External problem

Survey questions contain measurement error that biases estimates of relationships and means, and there is no easy way to know a question's quality before fielding it.

Internal problem

The researcher feels uncertain whether their data truly measure what they intend and anxious that their conclusions may be artifacts of poor questions.

Philosophical problem

Treating questionnaire design as an unteachable 'art' is wrong when scientific methods can make the consequences of design choices known and controllable.

The plan

  1. Operationalize complex concepts into concrete requests using the three-step procedure.
  2. Make informed choices about response scales, item structure, batteries, and data collection mode.
  3. Predict the quality of each question with SQP before fielding and improve weak questions.
  4. Estimate reliability, validity, and method effects where possible via MTMM designs.
  5. Correct correlations and estimates for measurement error in substantive analysis.
  6. Test equivalence and correct for measurement quality before cross-cultural comparison.

Success

  • The researcher fields higher-quality questions, knows their measurement quality in advance, obtains less biased estimates of relationships and means, and can make valid cross-cultural comparisons.

At stake

  • The researcher unknowingly uses low-quality questions, draws biased conclusions, attributes measurement artifacts to substantive differences, and produces non-comparable cross-national results.

Model of the world · 12 constructs · 18 relations

A causal-path model in which question design choices and contextual conditions (language, country, mode) influence the psychological response process and method reactions, which in turn determine measurement quality (reliability, validity, method effects), and ultimately the accuracy of substantive estimates of relationships and means.

Design levers

  • Question Design Choices

Intermediate states & behaviors

  • Cognitive Response Process
  • Method Reaction

Outcomes

  • Total Measurement Quality
  • Validity
  • Accuracy of Substantive Estimates
  • Reliability
  • Cross-Cultural Comparability
  • Item Nonresponse
  • Response Bias
  • Composite Score Quality

Moderators / context: Contextual Conditions

Consolidated shape of the book’s model — full constructs and relationships below.

Question Design Choices · design lever

The set of controllable formulation decisions for a survey request, including request type (direct/indirect, WH-word), response scale type and number of categories, labeling, balance, use of batteries, additional components, and fixed reference points.

Contextual Conditions · contextual condition

The survey context not chosen freely by the designer, including the country, the language of administration and translation, the data collection mode, and the position of the question within the questionnaire.

Cognitive Response Process · psychological state

The internal cognitive process triggered by a request, which converts the respondent's underlying opinion on the concept of interest into a preliminary reaction. It is characterized by an intercept and a slope linking the latent concept to that reaction.

Method Reaction · psychological state

The systematic, respondent-specific reaction to the particular method used to express an answer (e.g., scale type, agree/disagree format), producing common method variance shared across items measured with the same method.

Reliability · outcome metric

The strength of the relationship between the observed response and its true score, reflecting the degree to which the measure is free of random measurement error; the reliability coefficient squared is the proportion of observed variance due to the true score.

Validity · outcome metric

The strength of the relationship between the latent trait of interest and the true score, reflecting freedom from systematic method error. Its complement is the method effect: the squared validity coefficient plus the squared method effect coefficient equals one.

Total Measurement Quality · outcome metric

The overall strength of the relationship between the observed variable and the latent concept of interest. The quality coefficient is the product of the reliability coefficient and the validity coefficient; its square (reliability times validity) is the proportion of observed variance that reflects the intended construct.
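The three definitions above can be tied together in a few lines. The following is a minimal sketch, assuming standardized variables and writing r for the reliability coefficient, v for the validity coefficient, and m for the method effect coefficient; the function name and the example values are hypothetical and are not taken from the book or from SQP.

```python
import math


def measurement_quality(r: float, v: float) -> dict:
    """Derive quality indicators from a reliability coefficient r and a
    validity coefficient v (both assumed to lie in [0, 1])."""
    if not (0.0 <= r <= 1.0 and 0.0 <= v <= 1.0):
        raise ValueError("r and v must lie between 0 and 1")
    reliability = r ** 2                        # share of observed variance due to the true score
    validity = v ** 2                           # share of true-score variance due to the trait
    method_effect = math.sqrt(1.0 - validity)   # m, from v^2 + m^2 = 1
    quality = reliability * validity            # q^2 = (r * v)^2
    return {
        "reliability": reliability,
        "validity": validity,
        "method_effect": method_effect,
        "quality": quality,
    }


# Hypothetical coefficients, chosen only for illustration.
example = measurement_quality(r=0.9, v=0.8)
print(example)
```

With these illustrative inputs, roughly half of the observed variance reflects the intended construct, even though both coefficients look high on their own.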

Item Nonresponse · outcome metric

The extent of missing values on a survey item, which disrupts analysis and can render results unrepresentative of the population; treated as a basic quality criterion alongside bias.

Response Bias · outcome metric

A systematic difference between the real values of the variable of interest and the observed scores corrected for random error, reflected in shifted response distributions across methods.

Composite Score Quality · outcome metric

The quality of an aggregated measure (sum or weighted composite) of multiple concepts-by-intuition used to represent a concept-by-postulation, derived from the qualities of its component indicators and the chosen weights.
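As a rough illustration of how a composite's quality follows from its indicators and weights, the sketch below computes the squared correlation between a weighted sum and a single latent variable. It rests on simplifying assumptions that are not the book's full treatment: standardized indicators, one common latent variable, uncorrelated errors, and no shared method variance. The function name and inputs are hypothetical.

```python
def composite_quality(quality_coefficients, weights=None):
    """Squared correlation between a weighted sum of standardized indicators
    and the latent variable, assuming one common factor and no method effects.

    quality_coefficients: per-indicator quality coefficients q_i.
    weights: optional per-indicator weights (defaults to unit weights).
    """
    q = list(quality_coefficients)
    w = [1.0] * len(q) if weights is None else list(weights)
    if len(w) != len(q) or not q:
        raise ValueError("need one weight per indicator and at least one indicator")

    # Covariance between the composite and the latent variable.
    cov_composite_latent = sum(wi * qi for wi, qi in zip(w, q))

    # Variance of the composite: indicator variances are 1, and the
    # covariance between two different indicators is q_i * q_j.
    var_composite = sum(
        w[i] * w[j] * (1.0 if i == j else q[i] * q[j])
        for i in range(len(q))
        for j in range(len(q))
    )
    return cov_composite_latent ** 2 / var_composite


# Hypothetical example: three indicators of equal, moderate quality.
print(composite_quality([0.7, 0.7, 0.7]))
```

Under these assumptions a sum of several moderate indicators has higher quality than any single one of them; shared method variance, which this sketch ignores, would reduce that gain.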

Accuracy of Substantive Estimates · outcome metric

The degree to which estimated relationships between variables and comparisons of means reflect true population values rather than artifacts of measurement error; improved by correcting correlations and estimates using quality information.
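To make the correction step concrete, here is a minimal sketch of correcting an observed correlation between two survey measures. It assumes the usual MTMM true-score decomposition for standardized variables, in which the observed correlation equals the latent correlation scaled by both quality coefficients plus a common method term that is present only when the two questions share a method. The function name, argument names, and numbers are hypothetical.

```python
import math


def corrected_correlation(r_obs, r1, v1, r2, v2, same_method=True):
    """Estimate the correlation between two latent concepts from an observed one.

    Assumed decomposition (standardized variables, uncorrelated random errors):
        r_obs = (r1 * v1) * rho * (v2 * r2) + r1 * m1 * m2 * r2
    where m_i = sqrt(1 - v_i^2) and the second term applies only when both
    questions were asked with the same method.
    """
    q1, q2 = r1 * v1, r2 * v2
    if q1 <= 0 or q2 <= 0:
        raise ValueError("quality coefficients must be positive")
    common_method = 0.0
    if same_method:
        m1 = math.sqrt(1.0 - v1 ** 2)
        m2 = math.sqrt(1.0 - v2 ** 2)
        common_method = r1 * m1 * m2 * r2
    return (r_obs - common_method) / (q1 * q2)


# Hypothetical values: the same latent correlation can look very different
# in the observed data depending on whether the method is shared.
print(corrected_correlation(0.5508, 0.9, 0.8, 0.9, 0.8, same_method=True))
print(corrected_correlation(0.2592, 0.9, 0.8, 0.9, 0.8, same_method=False))
```

Both calls use illustrative inputs constructed from the same underlying latent correlation, which shows how shared method variance can inflate an observed correlation while random error and invalidity attenuate it.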

Cross-Cultural Comparability · outcome metric

The extent to which measures have equivalent meaning across countries and languages (configural, metric, scalar, or cognitive equivalence), enabling valid comparison of means and relationships across groups.

How they connect

  • Question design choices influence the cognitive response process
  • Question design choices influence method reaction
  • Question design choices predict reliability
  • Question design choices predict validity
  • Contextual conditions moderate reliability
  • Contextual conditions moderate the cognitive response process
  • The cognitive response process influences validity
  • Method reaction influences validity
  • Method reaction influences response bias
  • Reliability predicts total measurement quality
  • Validity predicts total measurement quality
  • Total measurement quality predicts composite score quality
  • Total measurement quality influences accuracy of substantive estimates
  • Composite score quality influences accuracy of substantive estimates
  • Item nonresponse influences accuracy of substantive estimates
  • Contextual conditions influence cross-cultural comparability
  • Total measurement quality influences cross-cultural comparability
  • Cross-cultural comparability influences accuracy of substantive estimates

Possible measures & feedback loops

A candidate team / org survey built from this book’s model — exploratory operationalizations, not validated instruments. Where a construct maps to a validated measure in Principia, we’ll point to that instead.

Question Design Choices

SQP coded characteristics; scale length; labeling completeness

self-report suitability: none

Contextual Conditions

country code; language code; mode code; position index

self-report suitability: none

Cognitive Response Process

estimated cognitive slope; estimated cognitive intercept

self-report suitability: low

Method Reaction

method factor loading; common method variance

self-report suitability: none

Reliability

reliability coefficient; SQP-predicted reliability

self-report suitability: none

Validity

validity coefficient; SQP-predicted validity

self-report suitability: none

Total Measurement Quality

quality coefficient squared; SQP-predicted quality

self-report suitability: none

Item Nonresponse

missingness rate

self-report suitability: none

Response Bias

distribution differences across methods; deviation from factual benchmarks

self-report suitability: none

Composite Score Quality

composite-latent correlation; invalidity due to method in composite

self-report suitability: none

Accuracy of Substantive Estimates

corrected vs uncorrected effect sizes; corrected vs uncorrected explained variance

self-report suitability: none

Cross-Cultural Comparability

equality of slopes/intercepts; JRule judgments; power-aware misspecification indicators

self-report suitability: none

Frameworks & instruments in this book

  • Operationalization should proceed deliberately from concept-by-postulation to concept-by-intuition to assertion to request for an answer.
  • Every design decision has a measurable consequence for question quality.
  • Quality must be defined and estimated separately as reliability, validity, and method effect.
  • Measurement error is unavoidable and must be corrected for in analysis, not ignored.
  • Comparisons across methods, countries, and languages are valid only after establishing equivalence and correcting for measurement quality.
  • Prefer simpler, higher-quality formulations (item-specific, fewer components, fixed reference points) over complex or battery formats when feasible.

Several of these are operationalized as tools in the People Analytics Toolbox.