Introduction to Survey Sampling (Quantitative Applications in the Social Sciences)
In a sentence
A concise, practical guide to designing and analyzing probability sample surveys, balancing sampling theory with the real-world problems of frames, nonresponse, and complex designs.
Graham Kalton's Introduction to Survey Sampling distills the essential techniques of probability sampling into a highly readable text for researchers who use surveys but are not statisticians. Beginning with simple random sampling, it builds step by step through systematic sampling, stratification, clustering, multistage designs, and probability-proportional-to-size (PPS) selection, then confronts the messy realities of imperfect sampling frames, nonresponse, weighting, and the estimation of sampling errors from complex designs. Two worked examples (a national face-to-face survey and a telephone random-digit-dialing survey) and a discussion of nonprobability and quota sampling show how the pieces combine in practice. The book teaches the reader to weigh precision against cost, to recognize when standard formulas mislead, and to anticipate the practical pitfalls that can ruin an otherwise well-conceived study.
The story it tells the reader
The reader
A social-science researcher or survey practitioner who wants to draw valid, efficient samples and produce trustworthy population estimates.
External problem
Designing a sample that yields precise, unbiased estimates within budget while coping with imperfect frames and nonresponse.
Internal problem
Feeling that sampling is an intimidating technical black box best left to statisticians.
Philosophical problem
It is wrong to let a poorly designed sample undermine otherwise careful research; researchers should understand the foundation of their evidence.
The plan
- Define the target and survey populations carefully.
- Choose an appropriate probability design (SRS, systematic, stratified, cluster, multistage, PPS).
- Build and assess the sampling frame, handling missing, clustered, blank, and duplicate listings.
- Minimize and compensate for nonresponse.
- Apply weights and compute sampling errors appropriate to the design.
- Determine sample size from precision, design effect, and nonresponse, balancing cost.
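The last step of the plan can be sketched numerically. The snippet below is a minimal illustration, not the book's own procedure: it assumes the goal is a proportion estimated at roughly 95% confidence, uses the textbook simple-random-sampling formula, ignores the finite population correction, and then inflates the result for the design effect and expected nonresponse. The function name and all input values are hypothetical.

```python
import math

def required_sample_size(p, margin, deff=1.0, response_rate=1.0, z=1.96):
    """Number of elements to select when estimating a proportion.

    All arguments are illustrative planning inputs (assumptions):
    p is the anticipated proportion, margin the desired half-width of
    the confidence interval, deff the anticipated design effect, and
    response_rate the expected share of selected elements that respond.
    """
    # Size needed under simple random sampling for the target margin
    n_srs = (z ** 2) * p * (1 - p) / margin ** 2
    # Inflate for the loss (or gain) of precision from the complex design
    n_complex = n_srs * deff
    # Inflate again so enough completed cases remain after nonresponse
    return math.ceil(n_complex / response_rate)

n = required_sample_size(p=0.5, margin=0.03, deff=1.5, response_rate=0.7)
print(n)
```

Raising the design effect or lowering the expected response rate increases the number of selections, and therefore the cost, which is the trade-off the plan asks the researcher to balance.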
Success
- Surveys produce defensible, precise estimates with quantified uncertainty.
- The researcher confidently navigates frame and nonresponse problems and complex designs.
- Resources are used efficiently, matching precision to need and budget.
At stake
- Selection bias and frame errors render results untrustworthy.
- Sampling errors are misstated, overstating precision.
- Time and money are wasted on a sample whose results cannot support valid inference.
Model of the world · 11 constructs · 12 relations
A framework linking sample design choices and survey conditions to intermediate statistical and operational states (probabilistic coverage, frame quality, response, design effect) that determine the bias, precision, and cost of survey estimates.
Design levers
- Sample Size
- Clustering / Multistage Structure
- Probability Design Choice
- Stratification
Intermediate states & behaviors
- Design Effect
- Equality of Selection Probabilities
- Response Rate
Outcomes
- Estimate Precision
- Estimate Bias
- Survey Cost
Moderators / context: Sampling Frame Quality
Probability Design Choice · design lever
The selection of a probability sampling scheme (SRS, systematic, stratified, cluster, multistage, PPS) that assigns each population element a known nonzero selection probability, forming the structural backbone of the sample.
Stratification · design lever
The classification of the population into internally homogeneous strata from which separate samples are drawn, controlling sample sizes by stratum to improve precision or guarantee domain estimates.
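As a rough sketch of how stratum sample sizes are controlled, the snippet below shows proportionate allocation and the stratum-weighted mean. It is one simple allocation rule among several, and the stratum names, sizes, and means are made-up values for illustration.

```python
def proportionate_allocation(stratum_sizes, n):
    """Split a total sample of n across strata in proportion to size.

    Rounding can make the allocations sum to slightly more or less
    than n; a real design would adjust for that.
    """
    total = sum(stratum_sizes.values())
    return {h: round(n * size / total) for h, size in stratum_sizes.items()}

def stratified_mean(stratum_sizes, stratum_means):
    """Combine stratum sample means using population shares W_h = N_h / N."""
    total = sum(stratum_sizes.values())
    return sum(stratum_sizes[h] / total * stratum_means[h] for h in stratum_sizes)

# Hypothetical population counts and stratum sample means
sizes = {"urban": 6000, "suburban": 3000, "rural": 1000}
means = {"urban": 40.0, "suburban": 50.0, "rural": 60.0}

allocation = proportionate_allocation(sizes, n=500)
estimate = stratified_mean(sizes, means)
print(allocation, estimate)
```

To guarantee estimates for a small domain such as the rural stratum here, a designer might instead oversample it, at the price of unequal selection probabilities and the need for weights.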
Clustering / Multistage Structure · design lever
The use of grouped sampling units (clusters, PSUs) in which only a sample of clusters is selected and elements are subsampled within them, accepting some loss of precision in exchange for substantial savings in data-collection cost.
Sampling Frame Quality · contextual condition
The degree to which the frame lists each population element once and only once, free of missing elements, clusters of elements, blanks/foreign elements, and duplicate listings, determining coverage of the target population.
Response Rate · behavioral pattern
The proportion of eligible sampled elements from which usable data are obtained, reflecting success in avoiding refusals, noncontacts, and incapacity, and bounding potential nonresponse bias.
Equality of Selection Probabilities · intermediate state
The extent to which the realized design is epsem (equal probability of selection), since unequal probabilities from frame or design features necessitate weighting and typically reduce precision.
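A minimal sketch of the weighting consequence: base weights are taken as the inverse of each element's selection probability, and the precision penalty is approximated with the widely used Kish formula, 1 plus the squared coefficient of variation of the weights. That approximation is brought in here as an assumption rather than taken from the text above, and the selection probabilities are invented.

```python
def base_weights(selection_probs):
    """Base weight for each sampled element: inverse selection probability."""
    return [1.0 / p for p in selection_probs]

def weighting_design_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2 = 1 + cv(w)^2."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2

# Hypothetical sample: half the elements were selected at twice the rate
probs = [0.01] * 50 + [0.02] * 50
weights = base_weights(probs)

deff_w = weighting_design_effect(weights)
deff_epsem = weighting_design_effect([100.0] * 100)
print(round(deff_w, 3), deff_epsem)
```

An epsem design gives every element the same weight, so the factor is exactly 1; the more the weights vary, the further it rises above 1.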
Design Effect · intermediate state
The ratio of the variance of an estimator under the complex design to its variance under simple random sampling of the same size, summarizing how stratification, clustering, and unequal weights jointly affect precision.
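For clustered samples, a common approximation to this ratio is 1 + (b − 1)ρ, where b is the number of elements subsampled per cluster and ρ is the intraclass correlation. The sketch below applies that approximation under the assumption of equal-sized subsamples and no stratification or weighting effects; the input values are illustrative only.

```python
def cluster_design_effect(b, rho):
    """Approximate design effect for a cluster sample.

    b is the (assumed constant) subsample size per cluster and rho the
    intraclass correlation of the survey variable within clusters.
    """
    return 1 + (b - 1) * rho

def effective_sample_size(n, deff):
    """Size of the simple random sample that would give the same variance."""
    return n / deff

deff = cluster_design_effect(b=10, rho=0.05)
n_eff = effective_sample_size(n=2000, deff=deff)
print(round(deff, 2), round(n_eff))
```

Even a small ρ matters when b is large, which is why taking many elements from few clusters can be cheap to field but costly in precision.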
Sample Size · design lever
The number of elements from which data are collected, the single most important determinant of sampling variance for large populations and a primary cost driver subject to precision and budget trade-offs.
Estimate Precision · outcome metric
The smallness of the sampling error (standard error / variance) of a survey estimator, determining the width of confidence intervals around population parameters such as means and proportions.
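To make the link between standard error and interval width concrete, here is a small sketch for a proportion. It assumes a large population (no finite population correction), a normal-approximation interval, and a design effect supplied by the analyst; the function names and numbers are hypothetical.

```python
import math

def proportion_standard_error(p, n, deff=1.0):
    """Standard error of a sample proportion, inflated by the design effect."""
    srs_variance = p * (1 - p) / n  # variance under simple random sampling
    return math.sqrt(deff * srs_variance)

def confidence_interval(p, n, deff=1.0, z=1.96):
    """Approximate 95% interval (by default) around the estimate p."""
    se = proportion_standard_error(p, n, deff)
    return p - z * se, p + z * se

low, high = confidence_interval(p=0.4, n=1000, deff=1.5)
low_srs, high_srs = confidence_interval(p=0.4, n=1000)
print(round(high - low, 4), round(high_srs - low_srs, 4))
```

Treating the complex sample as if it were a simple random sample (deff = 1) yields the narrower interval, which is how sampling errors end up overstating precision.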
Estimate Bias · outcome metric
The systematic deviation of the expected value of a survey estimator from the true population parameter, arising chiefly from frame noncoverage, nonresponse, and selection bias.
Survey Cost · outcome metric
The total resources (money, time, interviewer effort, travel) required to execute the sample design and data collection, which constrains achievable sample size and precision.
How they connect
- stratification use lowers design effect
- clustering use raises design effect
- clustering use lowers survey cost
- design effect lowers estimate precision
- sample size raises estimate precision
- sample size raises survey cost
- probability design choice influences selection probability equality
- selection probability equality lowers design effect
- sampling frame quality lowers estimate bias
- response rate lowers estimate bias
- sampling frame quality influences selection probability equality
- required estimate precision influences sample size
Possible measures & feedback loops
A candidate team / org survey built from this book’s model — exploratory operationalizations, not validated instruments. Where a construct maps to a validated measure in Principia, we’ll point to that instead.
Probability Design Choice
design type classification; selection equation parameters
self-report suitability: low
Stratification
number of strata; sampling fraction by stratum
self-report suitability: none
Clustering / Multistage Structure
cluster size; subsample size b; number of stages
self-report suitability: none
Sampling Frame Quality
coverage rate; blank rate; duplicate rate
self-report suitability: none
Response Rate
completed/eligible ratio; refusal proportion; contact attempts
self-report suitability: low
Equality of Selection Probabilities
coefficient of variation of weights; weight range
self-report suitability: none
Design Effect
v(z)/v(z0) ratio; intraclass correlation rho
self-report suitability: none
Sample Size
n records; n by domain
self-report suitability: none
Estimate Precision
standard error; confidence interval width; coefficient of variation
self-report suitability: none
Estimate Bias
nonrespondent proportion (Wm) times respondent-nonrespondent difference; benchmark comparison
self-report suitability: none
Survey Cost
cost per cluster; cost per element; total budget
self-report suitability: low
Frameworks & instruments in this book
- Define an ideal target population first, then explicitly note exclusions to form the survey population.
- Each element must have a known, nonzero probability of selection for valid inference.
- Form strata to be internally homogeneous; form clusters to be internally heterogeneous.
- Use the design effect to translate between complex-design and simple-random-sampling precision.
- Keep nonresponse small because its bias is the product of nonresponse rate and respondent-nonrespondent difference.
- Choose between designs by balancing precision against survey cost.
Several of these are operationalized as tools in the People Analytics Toolbox.
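The nonresponse rule in the list above can be illustrated with a few lines of arithmetic. This is a sketch under the usual deterministic view that the population splits into respondents and nonrespondents; the nonrespondent mean is never observed in practice, so the figures here are purely hypothetical.

```python
def nonresponse_bias(nonresponse_rate, respondent_mean, nonrespondent_mean):
    """Bias of the respondent mean as an estimate of the full population mean.

    Computed as the nonrespondent share times the difference between
    the respondent and nonrespondent means.
    """
    return nonresponse_rate * (respondent_mean - nonrespondent_mean)

# Hypothetical: 30% nonresponse, respondents at 60%, nonrespondents at 50%
bias = nonresponse_bias(0.30, 0.60, 0.50)
bias_low_nr = nonresponse_bias(0.10, 0.60, 0.50)
print(round(bias, 3), round(bias_low_nr, 3))
```

The product form shows why a high response rate bounds the damage: with the same gap between the two groups, cutting nonresponse from 30% to 10% cuts the bias to a third.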
Topics
- applied statistics
- research methods
Related in the library
- Statistics: A Very Short Introduction (Very Short Introductions) · David J. Hand · Statistics
- SURVEY & QUESTIONNAIRE DESIGN: Collecting Primary Data to Answer Research Questions (55) · Jane Bourke, Ann Kirby & Justin Doran · Statistics
- 12: The Elements of Great Managing · Rodd Wagner & James Harter · Statistics
- Antifragile (Incerto) · Nassim Nicholas Taleb · Statistics
- Big Data: A Very Short Introduction (Very Short Introductions) · Dawn E. Holmes · Statistics
- Compensation · Lance A. Berger & Dorothy Berger · Statistics