Mitigating unobserved spatial confounding when estimating the effect of supermarket access on cardiovascular disease deaths

Confounding by unmeasured spatial variables has received some attention in both the spatial statistics and causal inference literatures, but the concepts and approaches of the two fields have remained largely separate. In this paper, we aim to bridge these two strands of statistics by considering unmeasured spatial confounding within a causal inference framework and estimating effects using outcome regression tools popular in the spatial literature. First, we show that including spatially correlated random effects in the outcome model, an approach common among spatial statisticians, does not necessarily mitigate bias due to spatial confounding; this result has been published previously but is not universally known. Motivated by the bias term of commonly used estimators, we propose an affine estimator that addresses this deficiency. We discuss how unbiased estimation of causal parameters in the presence of unmeasured spatial confounding can be achieved only under a set of untestable assumptions, which will often be application-specific. We provide one such set of assumptions, which describes how the exposure and outcome of interest relate to the unmeasured variables and which is sufficient for identification of the causal effect from the observed data. We examine identifiability issues through the lens of restricted maximum likelihood estimation in linear models, and we implement our method using a fully Bayesian approach applicable to any type of outcome variable. This work is motivated by, and applied to, estimating the effect of county-level limited access to supermarkets on the rate of cardiovascular disease deaths among the elderly across the continental United States. Whereas standard approaches return null or protective effects, our approach uncovers evidence of unobserved spatial confounding and indicates that limited supermarket access has a harmful effect on cardiovascular mortality.
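The basic problem can be illustrated with a small simulation. The sketch below is not the affine estimator proposed in the paper, and none of its names or values come from the paper: the one-dimensional locations, the sinusoidal confounder `u`, and the values of `beta` and `gamma` are all assumptions chosen for illustration. It shows only that an outcome regression that omits a spatially smooth confounder can give a biased exposure coefficient, and that the bias disappears when the confounder is available, which it is not in the setting the paper studies.

```python
import math
import random


def simulate(n=2000, beta=1.0, gamma=2.0, seed=0):
    """Simulate exposure x and outcome y sharing a smooth spatial confounder u.

    All settings here are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    s = [i / n for i in range(n)]                     # locations on [0, 1)
    u = [math.sin(2 * math.pi * si) for si in s]      # smooth spatial confounder
    x = [ui + rng.gauss(0, 1) for ui in u]            # exposure depends on u
    y = [beta * xi + gamma * ui + rng.gauss(0, 1)     # outcome depends on x and u
         for xi, ui in zip(x, u)]
    return x, u, y


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance (divisor n); cov(a, a) is the variance."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def naive_slope(x, y):
    """Least-squares slope of y on x alone, ignoring the confounder."""
    return cov(x, y) / cov(x, x)


def adjusted_slope(x, u, y):
    """Least-squares coefficient of x when y is regressed on both x and u."""
    sxx, suu, sxu = cov(x, x), cov(u, u), cov(x, u)
    sxy, suy = cov(x, y), cov(u, y)
    return (sxy * suu - sxu * suy) / (sxx * suu - sxu ** 2)


if __name__ == "__main__":
    x, u, y = simulate()
    print("true effect:       1.0")
    print("naive estimate:   ", round(naive_slope(x, y), 3))
    print("adjusted estimate:", round(adjusted_slope(x, u, y), 3))
```

Under these assumed settings the naive slope lands well above the true effect, while the adjusted slope stays close to it. In practice `u` is unobserved, so the adjusted regression is not available, which is why identification has to rest on assumptions about how the exposure and outcome relate to the unmeasured variables.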
