A Bayesian Semiparametric Approach to Intermediate Variables in Causal Inference

In causal inference studies, treatment comparisons often need to be adjusted for confounded post-treatment variables. Principal stratification (PS) is a framework for handling such variables within the potential outcomes approach to causal inference. Continuous intermediate variables introduce inferential challenges to PS analysis. Existing methods either dichotomize the intermediate variable or assume a fully parametric model for the joint distribution of the potential intermediate variables. However, the former is subject to information loss and an arbitrary choice of cutoff point, and the latter is often inadequate to represent complex distributional and clustering features. We propose a Bayesian semiparametric approach that consists of a flexible parametric model for the potential outcomes and a Bayesian nonparametric model for the potential intermediate outcomes using a Dirichlet process mixture (DPM) model. The DPM approach provides flexibility in modeling the possibly complex joint distribution of the potential intermediate outcomes and, through its clustering feature, makes the results easier to interpret. Posterior inference is carried out by Gibbs sampling. We illustrate the method with two applications: one concerning partial compliance in a randomized clinical trial, and one concerning the causal mechanism between physical activity, body mass index, and cardiovascular disease in the observational Swedish National March Cohort study.
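
To make the clustering behavior of a DPM concrete, the following is a minimal sketch of a prior draw from a truncated stick-breaking representation of a Dirichlet process mixture for the pair of potential intermediate variables. It is not the Gibbs sampler developed in the paper. The function names, the truncation level `n_components`, the independent normal base measure, and the unit within-cluster standard deviation are all illustrative assumptions.

```python
import random


def stick_breaking_weights(alpha, n_components, rng):
    """Truncated stick-breaking weights for a Dirichlet process.

    Each break v ~ Beta(1, alpha) takes a fraction of the remaining stick.
    """
    weights = []
    remaining = 1.0
    for _ in range(n_components - 1):
        v = rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    # The last component absorbs the leftover mass so the weights sum to 1.
    weights.append(remaining)
    return weights


def sample_dpm(n, alpha=1.0, n_components=20, seed=0):
    """Draw n pairs (D(0), D(1)) from a truncated DPM of normals (prior draw)."""
    rng = random.Random(seed)
    weights = stick_breaking_weights(alpha, n_components, rng)
    # Assumed base measure: independent N(0, 3^2) for each cluster mean.
    atoms = [(rng.gauss(0.0, 3.0), rng.gauss(0.0, 3.0))
             for _ in range(n_components)]
    # Cluster labels: units sharing a label share a mixture component.
    labels = rng.choices(range(n_components), weights=weights, k=n)
    # Assumed unit standard deviation within each cluster.
    draws = [(rng.gauss(atoms[k][0], 1.0), rng.gauss(atoms[k][1], 1.0))
             for k in labels]
    return weights, labels, draws


if __name__ == "__main__":
    weights, labels, draws = sample_dpm(200, alpha=1.0)
    print("occupied clusters:", len(set(labels)))
```

Smaller values of `alpha` concentrate the weights on fewer components, so units tend to fall into a handful of clusters. That grouping is the feature the abstract refers to when it says the DPM helps with interpretation.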
