Comparing Alternatives for Estimation from Nonprobability Samples

Three approaches to estimation from nonprobability samples are quasi-randomization, superpopulation modeling, and doubly robust estimation. In the first, the sample is treated as if it were obtained via a probability mechanism, but, unlike in probability sampling, that mechanism is unknown. Pseudo selection probabilities are estimated by using the sample in combination with an external data set that covers the desired population. In the superpopulation approach, observed values of analysis variables are treated as if they had been generated by some model. The model is estimated from the sample and, along with external population control data, is used to project the sample to the population. The specific techniques are the same as or similar to ones commonly employed for estimation from probability samples and include binary regression, regression trees, and calibration. Combining quasi-randomization and superpopulation modeling is referred to as doubly robust estimation. This article reviews some of the estimation options and compares them in a series of simulation studies.
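The three approaches can be illustrated with a minimal Python sketch for estimating a population total. Everything in it is an assumption made for illustration and is not taken from the article: the toy data, the function names, the cell-based pseudo selection probability (sample count over population count within a group), and the straight-line outcome model. These two simple models stand in for the binary regression, regression tree, and calibration techniques named above. The doubly robust total is written here in one common form, the model-based total plus the pseudo-weighted sum of sample residuals.

```python
from collections import Counter

# Hypothetical toy data. Covariates (group, x) are known for the whole
# population, e.g. from an external census-type file; y is seen only in the sample.
POPULATION = [
    ("young", 1.0), ("young", 2.0), ("young", 3.0), ("young", 4.0),
    ("old", 5.0), ("old", 6.0), ("old", 7.0), ("old", 8.0),
]
# Volunteer sample of (group, x, y); the young are overrepresented.
SAMPLE = [
    ("young", 1.0, 3.0), ("young", 2.0, 5.0), ("young", 4.0, 9.0),
    ("old", 6.0, 13.0),
]


def pseudo_probabilities(sample, population):
    """Cell-based pseudo selection probability n_g / N_g for each group g."""
    n = Counter(g for g, _, _ in sample)
    big_n = Counter(g for g, _ in population)
    return {g: n[g] / big_n[g] for g in n}


def fit_line(sample):
    """Ordinary least squares fit of y on x; returns (intercept, slope)."""
    xs = [x for _, x, _ in sample]
    ys = [y for _, _, y in sample]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def estimate_totals(sample, population):
    """Return the three estimates of the population total of y."""
    pi = pseudo_probabilities(sample, population)
    intercept, slope = fit_line(sample)

    def predict(x):
        return intercept + slope * x

    # Quasi-randomization: weight each sampled y by 1 / pseudo-probability.
    t_qr = sum(y / pi[g] for g, _, y in sample)
    # Superpopulation: sum model predictions over every population unit.
    t_model = sum(predict(x) for _, x in population)
    # Doubly robust: model total plus pseudo-weighted sample residuals.
    t_dr = t_model + sum((y - predict(x)) / pi[g] for g, x, y in sample)
    return {
        "quasi_randomization": t_qr,
        "superpopulation": t_model,
        "doubly_robust": t_dr,
    }


totals = estimate_totals(SAMPLE, POPULATION)
for name, value in totals.items():
    print(f"{name}: {value:.2f}")
```

In this toy data the outcome is exactly linear in x, so the model-based and doubly robust totals both recover the population total of 80, while the cell-based pseudo-weights alone give about 74.67 because selection within each group is not representative. This reflects only how the example was constructed and says nothing about the article's simulation results.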
