American Association for Public Opinion Research

Sampling and Weighting

Member Price: $175.00
Nonmember Price: $235.00

Student pricing available

Survey Weighting: Goals and Methods

Survey data sets usually come with at least one analysis weight for each respondent record in the sample. Analysts interested in calculating population estimates are told to use the same set of weights for all analyses: means, totals, linear and nonlinear models, etc. Analysis weights are designed to:

1. Account for the probabilities used to select units (in cases where random sampling is used);
2. Adjust in cases where it cannot be determined whether some sample units are members of the population under study;
3. Adjust for eligible units that do not respond to the survey, to limit the effects of nonresponse bias;
4. Incorporate external data to reduce standard errors of estimates and to compensate when the sample does not correctly cover the desired population.

Survey statisticians usually think of weighting in the context of probability samples, where units are selected by some random means from a well-defined population. All four steps above can be applied to probability samples. However, because of the current popularity of volunteer web panels and other kinds of "found" data, how to weight nonprobability samples is also worth considering.

This webinar will review different techniques used in weighting, including the formation of adjustment classes for unknown eligibility and nonresponse based on combinations of covariates, response propensity estimates, and regression trees. The use of auxiliary frame or population data to reduce variances and correct for coverage errors will be covered. We will also review how some of the same techniques can be used to weight nonprobability samples. Throughout, we will give examples of how R and Stata can be used to compute weights.
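
The weighting steps described above can be sketched in a few lines of code. The sketch below is in Python for illustration only (the webinar itself uses R and Stata), and all function and variable names are assumptions, not material from the course: base weights as inverse inclusion probabilities, a nonresponse adjustment within classes, and a final ratio adjustment to known population counts.

```python
def base_weights(incl_probs):
    """Step 1: base weight = inverse of each unit's selection probability."""
    return [1.0 / p for p in incl_probs]

def nonresponse_adjust(weights, classes, responded):
    """Step 3: within each adjustment class, spread the weight of
    nonrespondents over the respondents (nonrespondents get weight 0)."""
    totals, resp_totals = {}, {}
    for w, c, r in zip(weights, classes, responded):
        totals[c] = totals.get(c, 0.0) + w
        if r:
            resp_totals[c] = resp_totals.get(c, 0.0) + w
    return [w * totals[c] / resp_totals[c] if r else 0.0
            for w, c, r in zip(weights, classes, responded)]

def poststratify(weights, strata, pop_counts):
    """Step 4: ratio-adjust weights so that each stratum's weights
    sum to its known population count."""
    sums = {}
    for w, s in zip(weights, strata):
        sums[s] = sums.get(s, 0.0) + w
    return [w * pop_counts[s] / sums[s] if w > 0 else 0.0
            for w, s in zip(weights, strata)]
```

For example, four units selected with probabilities 0.1, 0.1, 0.2, 0.2 start with base weights 10, 10, 5, 5; if the second unit does not respond, its weight is redistributed to the respondent in the same class.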

Learning Objectives:

• Understand the different steps in weighting and the reasoning behind each
• Understand how weights are used to correct for coverage errors and nonresponse
• Understand how weighting approaches differ for probability and nonprobability samples

Non-probability Sampling for Finite Population Inference

Although selecting a probability sample has been the standard for decades for making inferences from a sample to a finite population, incentives are increasing to use data obtained without a defined sampling mechanism, i.e., non-probability samples. In a world of "big data", large amounts of data are readily available through methods that are faster and require fewer resources than most probability-based designs. There are now many ways of collecting data without a pre-specified sampling design: volunteer web panels, tele-voting, expert selection, respondent-driven network sampling, and others, none of which involves a probability sample.

Design-based inference, in which population values are estimated through the random sampling procedure specified by the sampler, cannot be used for non-probability samples. One alternative is quasi-randomization, in which pseudo-inclusion probabilities (referred to as propensity scores) are estimated from covariates available for both sample and nonsample units. Another estimation approach is superpopulation modeling, in which analytic variables collected on the sample units are used in a model to predict values for the nonsample units. Variances of estimators can be computed using replication methods or approaches derived from modeling. We include several simulation studies to illustrate the properties of these approaches and discuss the pros and cons of each.
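
As a minimal illustration of the quasi-randomization idea, consider estimating pseudo-inclusion probabilities within covariate cells: a unit's propensity is the cell's sample count divided by the cell's population count (known or estimated from a reference survey), and its pseudo-weight is the inverse. This Python sketch uses illustrative names that are assumptions, not material from the course; in practice propensities would be estimated from richer covariates via logistic regression or regression trees, of which the cell-based version is the fully saturated special case.

```python
from collections import Counter

def pseudo_inclusion_weights(cells, pop_counts):
    """For each nonprobability-sample unit, estimate its pseudo-inclusion
    probability as (cell sample size) / (cell population size) and
    return the inverse as a pseudo-weight."""
    n = Counter(cells)  # sample count per covariate cell
    return [pop_counts[c] / n[c] for c in cells]
```

When every cell is represented in the sample, the pseudo-weights sum to the population total, just as design weights do in a probability sample.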

Learning Objectives:
• Understand the different types of non-probability samples currently in use
• Understand how non-probability samples can be affected by coverage errors, nonresponse, and measurement errors
• Understand what methods of estimation can be used for non-probability samples and the arguments used to justify them

Design and Weighting for Dual Frame Surveys

The course will describe the reasons for considering dual frame surveys and the conditions under which the design is efficient. Dual frame designs with screening and overlapping units will be defined and the benefits and problems associated with each type of design will be discussed. Approaches to weighting dual frame surveys will be outlined, with an emphasis on the types of information needed to produce the weights and the types of errors (sampling and nonsampling errors) that are typically encountered in practice.
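
One standard weighting approach for overlapping dual frame designs is Hartley-style composite weighting, in which units in the overlap domain are down-weighted by a mixing constant so the overlap is not double-counted. The Python sketch below is illustrative only (the function name and the fixed mixing constant are assumptions, and in practice the constant would be chosen to minimize variance):

```python
def composite_weight(base_w, in_overlap, frame, lam=0.5):
    """Composite (Hartley-type) weight for a dual frame design with
    frames A and B. Units found on only one frame keep their base
    weight; overlap units sampled from frame A are multiplied by lam,
    and those sampled from frame B by (1 - lam)."""
    if not in_overlap:
        return base_w
    return base_w * (lam if frame == "A" else 1.0 - lam)
```

With lam = 0.4, an overlap unit with base weight 10 contributes weight 4 if it came from frame A and weight 6 if it came from frame B, so the two frames together account for the overlap domain exactly once.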

Learning Objectives: