Parameterizing and Simulating from Causal Models

Many statistical problems in causal inference involve a probability distribution other than the one from which data are actually observed; as an additional complication, the object of interest is often a marginal quantity of this other probability distribution. This creates many practical complications for statistical inference, even where the problem is non-parametrically identified. Naïve attempts to specify a model parametrically can lead to unwanted consequences such as incompatible parametric assumptions or the so-called ‘g-null paradox’. As a consequence, it is difficult to perform likelihood-based inference, or even to simulate from the model in a general way. We introduce the ‘frugal parameterization’, which places the causal effect of interest at its centre and then builds the rest of the model around it. We do this in a way that provides a recipe for constructing a regular, non-redundant parameterization using causal quantities of interest. In the case of discrete variables we use odds ratios to complete the parameterization, while in the continuous case we use copulas. Our methods allow us to construct and simulate from models with parametrically specified causal distributions, and to fit them using likelihood-based methods, including fully Bayesian approaches. Models we can fit and simulate from exactly include marginal structural models and structural nested models. Our proposal includes parameterizations for the average causal effect and the effect of treatment on the treated, as well as other causal quantities of interest. Our results will allow practitioners to assess their methods against the best possible estimators for correctly specified models, in a way that has previously been impossible.
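
To make the continuous case concrete, the following is a minimal sketch of simulating from a model whose causal margin is specified directly and then tied to a covariate through a copula. It is an illustration under stated assumptions, not the authors' implementation: the single confounder Z, the binary treatment X, the exponential distribution for Z, the normal causal margin Y | do(X = x) ~ N(beta0 + beta1 x, 1), the Gaussian copula with correlation rho, and all function and parameter names are hypothetical choices made for this example.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the Gaussian copula


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def simulate(n, beta0=0.0, beta1=1.0, rho=0.6, alpha=1.5, seed=0):
    """Draw (z, x, y, propensity) tuples from an assumed three-piece model:
    the past p(z, x), the causal margin p(y | do(x)), and a Gaussian copula
    linking Y and Z."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # 1. The past: covariate Z (assumed Exp(1)), then treatment X given Z.
        z = rng.expovariate(1.0)
        u_z = 1.0 - math.exp(-z)                  # F_Z(z), uniform on (0, 1)
        u_z = min(max(u_z, 1e-12), 1.0 - 1e-12)   # guard the inverse CDF
        p = expit(alpha * (2.0 * u_z - 1.0))      # assumed propensity model
        x = 1 if rng.random() < p else 0

        # 2. Dependence: the normal score of Y given the normal score of Z
        #    under a Gaussian copula with correlation rho.
        q_z = STD_NORMAL.inv_cdf(u_z)
        q_y = rho * q_z + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)

        # 3. Causal margin: map the score through N(beta0 + beta1 * x, 1).
        #    Averaged over Z, q_y is standard normal, so the margin holds.
        y = beta0 + beta1 * x + q_y
        rows.append((z, x, y, p))
    return rows


def naive_difference(rows):
    """Unadjusted difference in mean outcome between treatment arms."""
    y1 = [y for _, x, y, _ in rows if x == 1]
    y0 = [y for _, x, y, _ in rows if x == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


def ipw_difference(rows):
    """Normalized inverse-probability-weighted difference, using the known
    propensity score stored in each row."""
    num1 = den1 = num0 = den0 = 0.0
    for _, x, y, p in rows:
        if x == 1:
            w = 1.0 / p
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - p)
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0


if __name__ == "__main__":
    data = simulate(20000, seed=1)
    print("naive difference:", naive_difference(data))
    print("IPW difference:  ", ipw_difference(data))
```

Because the causal margin is fixed by construction, beta1 is the true average causal effect in this sketch. The weighted difference should therefore land close to beta1, while the unadjusted difference is pulled away from it by confounding through Z.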
