American Association for Public Opinion Research
The leading association of public opinion and survey research professionals

AGT Abstracts

Title: Statistical Data Integration
Authors: Ying Han and Partha Lahiri, University of Maryland, College Park
Abstract: Statistical data integration is a potential approach to cutting data-collection costs by avoiding the need to collect new survey data containing all the necessary information. Our interest in the topic has grown out of a small area estimation project. The project aims at borrowing strength from relevant auxiliary information in order to improve estimates at granular levels (e.g., small geographic areas) where survey data are sparse. However, the auxiliary information may not be recorded in the same survey data set that contains the study variable, but may instead be available in an administrative data set. Statistical data integration can also be used to better understand nonresponse behavior and to construct better nonresponse adjustments in public opinion polls. We may have very limited information on the nonrespondents from the sampling frame, but more valuable information about them can be obtained if survey and administrative data can be linked together. This will enable us to build a better model that explains nonresponse with more variables, and thus to improve the nonresponse adjustments.

Record linkage techniques can be used for data integration by identifying records from multiple sources that represent the same entity (e.g., person), even in the absence of a unique and error-free identifier (e.g., Social Security Number). Linkage errors are, however, inevitable and, if ignored, can lead to significant bias and increased variance of the estimate. Adjusting statistical methods for these linkage errors is a challenging problem.

Our goals are to develop a general integrated model that propagates the uncertainty of record linkage to the estimation stage, and to demonstrate its ability to improve estimation. We propose a mixture-model-based approach to link sample surveys to administrative records. To achieve our goal, we will first investigate the effect of linkage errors on estimation when those errors are ignored. Next, we will propose a model to characterize the randomness of the linkage process. The model is based on the idea that the unobserved values of auxiliary variables corresponding to surveyed units are an ordered subset of the observed values of these variables from the administrative records. The best prediction estimator of a mixed-effect parameter can be obtained by an estimating-function-based approach derived from the distribution of the response variable under the general integrated model. Due to the complexity of the model, we anticipate that deriving a closed-form variance expression will not be feasible, so we propose a jackknife resampling method to obtain a variance estimate. A Monte Carlo simulation study will be designed and implemented to evaluate the performance of the proposed method. The methodology will be applied to real-life small area estimation and nonresponse adjustment projects, such as poverty mapping.
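To illustrate the variance-estimation step only, the delete-one jackknife can be sketched as follows. This is a minimal sketch with a hypothetical sample and the sample mean as a stand-in estimator; the authors' actual estimator under the integrated model would be far more complex.

```python
import numpy as np

def jackknife_variance(data, estimator):
    """Delete-one jackknife variance estimate for a generic estimator."""
    n = len(data)
    # Recompute the estimator n times, leaving out one observation each time.
    theta_i = np.array([estimator(np.delete(data, i)) for i in range(n)])
    # Standard jackknife variance formula.
    return (n - 1) / n * np.sum((theta_i - theta_i.mean()) ** 2)

# Hypothetical data: the real application would use linked survey records.
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=200)
var_hat = jackknife_variance(sample, np.mean)
```

For the sample mean, the jackknife variance reduces exactly to the familiar s^2/n, which provides a quick sanity check of the implementation.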

Our research will enable researchers to make valid statistical inferences based on a probabilistically linked data set with more observations or a wider range of variables. Our proposal is appealing because it avoids costly new data collection operations for the purpose of performing a variety of statistical analyses.

Title: Surveys, Public Opinion, and Democracy: A Youth Education Initiative
Author: Allyson Holbrook, University of Illinois at Chicago
Abstract: The proposed initiative addresses negative public attitudes toward surveys. This challenge is central to survey research and is an underlying cause of other issues for survey researchers (decreased response rates, increased survey costs, and concerns about data quality). The proposed initiative is premised on two assumptions. First, childhood and adolescent socialization affects adult opinions and behaviors. Second, campaigns to change public attitudes on a large scale are expensive and often unsuccessful, and may result in polarization rather than mass change.

This initiative involves two educational programs to educate students about surveys and their role in democracy. The materials developed for these programs will link surveys and survey methodology to the core disciplines of social studies, science, and mathematics. The first program involves developing a K-12 curriculum about surveys and survey methodology. This curriculum would include materials for teachers and projects that can be implemented in age-appropriate ways to learn about surveys (e.g., students designing and conducting a survey of the student body). The second proposed program involves developing materials for an interactive presentation and workshop about survey methods that could be given by a guest to school-aged audiences. A corps of volunteer survey professionals would be recruited and trained to use these materials. The presentation would then be marketed to schools, with the goal of having volunteers give a small number of presentations at schools in their geographic area. In addition, an online version of the presentation would be recorded and made available to classes for which an in-person presentation is not possible.

The proposed approach to developing these two programs involves five steps that can be done in parallel for each of the two programs: (1) Assess what K-12 students in the U.S. currently learn about surveys and how these programs might best be integrated into the current curricula by examining current curricula and conducting focus groups with teachers of different grade levels and disciplines; (2) Develop age-appropriate curriculum and presentation materials with the help of experienced teachers and curriculum experts; (3) Pretest both the curriculum and the presentation materials in volunteer partner schools; (4) Revise the curriculum and presentation materials in response to the pretesting experience; and (5) Promote and distribute information about both programs to schools.

The goal of the proposed initiative is to educate children and adolescents about surveys so that they become citizens who understand the role of surveys in government and policy-making and can distinguish scientific surveys from non-scientific ones (e.g., selling, fund-raising, or push polls). The ultimate goal of the proposed initiative is to contribute to a public that understands and values surveys.

This initiative could benefit the broader survey research community by making the public (and policy-makers) more intelligent consumers of surveys and survey results. It may also enhance current university programs that provide training in survey methods by introducing students to surveys at an earlier age. This program would also enhance the relationship between survey research professionals across the country and their communities.

Title: Public Opinion in Space and Time: A Geospatial View of Public Attitudes Towards Surveys
Authors: Sarah Kelley, Celeste Stone, Mark Masterton and Clyde Tucker, American Institutes for Research
Abstract: Does everyone trust surveys equally? A common assumption in survey research is that the populations that respond to surveys are not meaningfully different from those who do not—or differ only in ways that can be identified and controlled for. But with growing divides in attitudes towards privacy, government, and science, as well as changing immigration policies that may frighten minority communities, those assumptions may no longer hold. It is therefore imperative that we develop actionable insights into changing public attitudes towards survey research in order to improve our ability to overcome nonresponse bias. We propose to conduct a sophisticated geospatial analysis—blending survey, administrative, and social media data—to understand differences in public trust in surveys across time and space.

Our vision for this analysis builds on our existing work using deep learning (specifically, Long Short-Term Memory or LSTM networks) to predict trends over time using data from varied sources. This technical approach allows us to combine diverse datasets—including unstructured social media and news data—into a highly accurate model that identifies trends over time (and yearly cycles) in public trust in surveys, as well as regional variations in these patterns, by repeatedly updating the information we collect from diverse sources.
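For readers unfamiliar with LSTMs, the core recurrence can be sketched as a single forward step in plain NumPy. The dimensions, random weights, and toy input sequence below are illustrative stand-ins; the actual system would use a trained network in a deep learning framework.

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One forward step of an LSTM cell: input, forget, output, candidate gates."""
    z = W @ x + U @ h_prev + b              # stacked pre-activations, shape (4*hidden,)
    i, f, o, g = np.split(z, 4)             # split into the four gate blocks
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)  # updated cell (memory) state
    h = sigmoid(o) * np.tanh(c)                        # new hidden state (the output)
    return h, c

# Toy dimensions: 3 input features per time step (e.g., a trust index plus two
# covariates), hidden size 5; weights are random stand-ins, not trained values.
rng = np.random.default_rng(1)
n_in, n_hid = 3, 5
W = rng.normal(size=(4 * n_hid, n_in))
U = rng.normal(size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(10):                         # unroll over a short input sequence
    x_t = rng.normal(size=n_in)
    h, c = lstm_step(x_t, h, c, W, U, b)
```

The cell state `c` is what lets the network carry information across many time steps, which is why LSTMs suit the multi-year, multi-source trend data described above.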

Our vision for this analysis also has a very practical application: a real-time tool for use by survey practitioners. We see this tool being of use at several different junctures in the survey lifecycle to help address the crisis in public trust in surveys in three core ways:

  1. It will allow survey practitioners to identify and correct for regional variations in response rates. For example, major changes to immigration policy may deter Hispanic populations in certain areas from participating in data collection efforts that are used to ensure that resources are allocated equitably.
  2. It will help survey practitioners who administer surveys during extended field periods select ideal times or plan for differential response rates across time. Field directors can then plan their surveys around ideal times or apply statistical corrections to known response issues.
  3. It will identify communities and areas where trust in surveys is particularly low, allowing for further study of the issue.

Our presentation will begin with a brief executive-level summary of the current state of public opinions about science and surveys. We will also present some of our methodological work that shows how incorporating GIS into analyses can reveal different patterns and reduce (or at least identify) inequities in data and analyses. We will then present the core idea including the use cases, target audience, and business model. As time and resources allow, we will also demo wire-frames and/or a beta version of the tool. In addition to answering audience questions during the Q&A session, our presenters have found that it can be helpful to engage the audience directly in providing feedback and collecting data to define new use cases during a presentation.

Title: Trust us: Leveraging a more nuanced understanding of trust in survey research
Author: Colleen McClain, University of Michigan
Abstract: Given growing suspicion of research, declining response rates, and increasingly political interpretations of survey data, the study of (dis)trust and its role in mediating survey response is growing in urgency. In particular, if individuals systematically do not respond due to a lack of trust, and if trust attitudes are related to outcomes of interest, their role in generating nonresponse bias becomes a concern.
Encouraging trust in surveys and statistics requires an understanding of why and how individuals allocate credibility. Existing research in this area is limited in its assessment of how trust dimensions potentially related to the survey request—trust, for example, in statistical products/institutions, in technology and data protection, or in groups or institutions that the survey raises to consciousness—vary across groups. Trust in statistics, for example, is itself multidimensional, and its transferability across groups has rarely been examined—despite Fellegi’s (2010) identification of cultural factors as influential in its formation. 

Furthermore, for those without knowledge specific to a request, assessments of trust may be associated with considerations ranging from confidence in Congress (Hunter Childs, King, & Fobia, 2015) to general cultural values (OECD, 2010; Smirnova & Scanlon, 2017).  Whether or not existing nonresponse trends that differ across sociodemographic groups can be understood as manifestations of differential trust is thus unclear.  And given these potential nuances, appeals to increase such trust could easily fall on deaf ears or be discounted if not appropriately designed, especially via a cursory scan of a self-administered survey request. 
In this context, building on existing methodological work on measuring trust in surveys and statistics (e.g., Couper, Singer, & Conrad, 2008; Fellegi, 2010; Brackfield, 2011; Hunter Childs, Willson, Martinez, Rasmussen, & Wroblewski, 2012; Hunter Childs, King, & Fobia, 2015; Smirnova & Scanlon, 2017), the proposed research aims to:

  1. Identify and assess the role of key identities/socio-demographic groups that may relate to differences in trust levels and relevant considerations that comprise trust; as well as the potential for such trust to influence willingness to participate in (and provide quality data for) research, particularly in a public opinion setting;
  2. Examine the feasibility and effectiveness of a tailored design to mitigate trust concerns.
A combination of qualitative and quantitative methods will be used to accomplish these goals. First, acknowledging a bias toward willing survey respondents—missing the highly distrustful—aim 1 will be informed by community focus groups, followed by low-cost web data collection stratified by sociodemographic subgroups.  Data collection will administer trust batteries (including those from Fellegi’s framework) for equivalence testing across groups; test the relationship between group memberships, trust, and hypothetical agreement; and inform the proposal, testing, and evaluation of a protocol designed to counteract salient trust concerns via the request for participation. The proposed design will seek to engage respondents, address concerns, and provide a mechanism for seeking information—with overall goals of fostering trust and reducing bias.
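The equivalence testing across groups mentioned above can be illustrated with a toy two-one-sided-tests (TOST) check on hypothetical trust scores. The groups, score scale, and equivalence margin below are illustrative assumptions, not the proposed trust batteries, and a normal approximation stands in for a full invariance analysis.

```python
import numpy as np
from statistics import NormalDist

def tost_equivalence(x, y, margin):
    """Two one-sided tests (TOST): are two group means equivalent within +/- margin?
    Uses a normal approximation; returns the larger of the two one-sided p-values."""
    diff = x.mean() - y.mean()
    se = np.sqrt(x.var(ddof=1) / len(x) + y.var(ddof=1) / len(y))
    z_lower = (diff + margin) / se          # test of H0: diff <= -margin
    z_upper = (diff - margin) / se          # test of H0: diff >= +margin
    p_lower = 1 - NormalDist().cdf(z_lower)
    p_upper = NormalDist().cdf(z_upper)
    return max(p_lower, p_upper)            # rejecting both H0s implies equivalence

# Hypothetical trust scores on a 1-5 scale for two sociodemographic groups.
rng = np.random.default_rng(2)
group_a = rng.normal(3.00, 1.0, 400)
group_b = rng.normal(3.05, 1.0, 400)
p = tost_equivalence(group_a, group_b, margin=0.5)  # margin is an assumed threshold
```

A small p-value here supports equivalence within the stated margin; failure to reject would flag a trust dimension that may not transfer across groups.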

Title: Automated Retrieval of Information From Open-Ended Survey Responses Using Natural Language Processing
Authors: Antonia Warren, Reanne Townsend, Hanyu Sun, David Cantor, Andrew Caporaso, and Gonzalo Rivero, Westat
Abstract: The public's willingness to participate in surveys has continued to decline over the years. The causes of this decline are complex and include distrust of pollsters, the emergence of telemarketing, a lack of understanding about how surveys work, and confidentiality and privacy concerns (Kim et al. 2010). All of these factors contribute to the public's distrust of surveys. However, research shows that despite this, the public still believes that surveys are fundamentally useful (Kim et al. 2010). There is hope that if survey researchers can improve the survey experience for respondents, attitudes about surveys and participation rates may improve as well. The Westat team proposes to leverage emergent technologies in surveys to improve respondent experiences and reduce burden. In particular, the Westat team proposes to use Natural Language Processing (NLP) to automatically retrieve information from open-ended survey responses. As an example, we suggest applying this method to survey interviews on crime victimization. However, the method could be extended to other contexts where it might be easier to ask what happened than to pose a long series of closed-ended questions (e.g., expenditures).

In surveys like the National Crime Victimization Survey, respondents are asked screener questions about general types of victimization and, upon an affirmative response, a series of questions about the details of the victimization. Respondents frequently offer spontaneous descriptions of the incidents before the follow-up questions are asked, and are therefore often asked redundant or inapplicable questions. This long interview process can be burdensome to respondents, especially when describing potentially distressing events. However, this issue could be alleviated by using NLP. In practice, respondents' incident narratives would be collected during in-person interviews; NLP would then be used to parse the text and retrieve incident characteristics (e.g., whether a weapon was used, the victim's relationship to the perpetrator, or the gender of the perpetrator) in real time during the survey. The retrieved information would be used to pre-fill responses to closed-ended follow-up questions about incident details. This will prevent the respondent from being asked about certain incident characteristics twice and from repeating (potentially sensitive) information that has already been conveyed. Standardized probes will be used during the open-ended response to elicit specific information about the incident characteristics. The NLP analysis process would include a measure of confidence in the classification of each incident characteristic. Essentially, we will develop a statistical model that predicts the probability of a characteristic being correct based on attributes of the narrative and of the characteristic itself. At the conclusion of the open-ended incident description, the respondent would be asked to review and revise any retrieved incident characteristics that have a low level of confidence.
Finally, any incident-characteristic questions that could not be answered by retrieving information from the narrative will be asked as closed-ended questions at the end of the survey. This approach has great potential for improving survey experiences and thereby positively changing public attitudes towards the survey, reducing item nonresponse, and increasing data quality.
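The retrieve-then-review pipeline described above can be sketched as follows. This toy version uses keyword rules in place of a trained statistical model, and the characteristic names, patterns, narrative, and heuristic confidence scores are all hypothetical, not part of the proposed system.

```python
import re

# Illustrative rule-based extractor; a production system would use a trained
# NLP model with a proper probability model for each characteristic.
PATTERNS = {
    "weapon_present": re.compile(r"\b(gun|knife|weapon|bat)\b", re.IGNORECASE),
    "perpetrator_known": re.compile(r"\b(my|neighbor|coworker|friend|ex)\b", re.IGNORECASE),
}

def extract_characteristics(narrative, base_confidence=0.6):
    """Return {characteristic: (value, confidence)} from an incident narrative.
    Confidence is a toy heuristic: more matching cues -> higher confidence."""
    results = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(narrative)
        value = bool(matches)
        # Cap the heuristic confidence at 0.95.
        conf = min(0.95, base_confidence + 0.15 * len(matches)) if matches else base_confidence
        results[name] = (value, conf)
    return results

narrative = "A neighbor I know pulled a knife on me outside my apartment."
chars = extract_characteristics(narrative)
# Only low-confidence items would be routed back to the respondent for review;
# the rest would pre-fill the closed-ended follow-up questions.
to_review = [name for name, (value, conf) in chars.items() if conf < 0.5]
```

The key design point mirrors the abstract: every retrieved characteristic carries a confidence score, and only low-confidence items trigger respondent review, keeping the interview short without sacrificing data quality.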