Accounting for non-response bias using participation incentives and survey design: An application using gift vouchers

Standard corrections for missing data rely on the strong and generally untestable assumption that data are missing at random. Heckman-type selection models relax this assumption, but have been criticized because they typically require a selection variable which predicts non-response but not the outcome of interest, and can impose bivariate normality. In this paper we illustrate an application using a copula methodology which does not rely on bivariate normality. We apply this approach to data on HIV testing from a demographic surveillance site in rural South Africa, which are affected by non-response. Randomized incentives are the ideal selection variable, particularly when implemented ex ante to deal with potential missing data. However, elements of survey design may also provide a credible method of correcting for non-response bias ex post. For example, although not explicitly randomized, allocation of food gift vouchers during our survey was plausibly exogenous and substantially raised participation, as did effective survey interviewers. Based on models with receipt of a voucher and interviewer identity as selection variables, our results imply that 37% of women in the population under study are HIV positive, compared to imputation-based estimates of 28%. For men, confidence intervals are too wide to reject the absence of non-response bias. The consistency of results across different selection variables and error structures strengthens these conclusions. Our application illustrates the feasibility of the selection model approach when combined with survey metadata.
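
As a rough illustration of the kind of copula selection model described above, the following sketch computes the three likelihood cells for a binary outcome observed only among participants. It is a minimal sketch under stated assumptions and not the paper's implementation: the Frank copula, the standard normal margins, and all names (`frank_copula`, `cell_probabilities`, `sel_index`, `out_index`, `theta`) are hypothetical choices made for illustration. Here `sel_index` stands for the linear index of the participation equation (which would include the selection variables, such as voucher receipt) and `out_index` for the linear index of the HIV status equation.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF (assumed margin for both error terms)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def frank_copula(u: float, v: float, theta: float) -> float:
    """Frank copula C(u, v; theta); theta near 0 reduces to independence."""
    if abs(theta) < 1e-8:
        return u * v
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def cell_probabilities(sel_index: float, out_index: float, theta: float):
    """Return (P(no test), P(tested, positive), P(tested, negative)).

    Participation: S = 1 if sel_index + u > 0.
    Outcome:       Y = 1 if out_index + e > 0, observed only when S = 1.
    The dependence between u and e is carried by the copula parameter theta.
    """
    fu = norm_cdf(-sel_index)  # P(u <= -sel_index) = P(S = 0)
    fe = norm_cdf(-out_index)  # P(e <= -out_index) = P(Y = 0), marginally
    c = frank_copula(fu, fe, theta)  # P(u <= -sel_index, e <= -out_index)
    p_missing = fu
    p_positive = 1.0 - fu - fe + c
    p_negative = fe - c
    return p_missing, p_positive, p_negative


def log_likelihood(rows, theta: float) -> float:
    """Sum log cell probabilities over rows of
    (participated, hiv_positive_or_None, sel_index, out_index)."""
    total = 0.0
    for participated, positive, sel_index, out_index in rows:
        p_missing, p_pos, p_neg = cell_probabilities(sel_index, out_index, theta)
        if not participated:
            total += math.log(p_missing)
        elif positive:
            total += math.log(p_pos)
        else:
            total += math.log(p_neg)
    return total
```

In this parameterization a negative `theta` corresponds to people who are less likely to participate being more likely to be HIV positive. In a full analysis the indices would be linear functions of covariates with coefficients estimated jointly with `theta` by maximum likelihood, and population prevalence would be obtained by averaging the marginal probability `1 - norm_cdf(-out_index)` over participants and non-participants alike.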
