MR‐BOIL: Causal inference in one‐sample Mendelian randomization for binary outcome with integrated likelihood method

Mendelian randomization is a statistical method for inferring causal relationships between exposures and outcomes using the instrumental variable approach developed in economics, with genetic variants serving as the instruments. The methodology is well developed when both the exposure and the outcome are continuous variables. However, because the odds ratio in the logistic model is noncollapsible, existing methods for binary outcomes that are inherited from the linear model cannot account for the effect of confounding factors, which leads to biased estimates of the causal effect. In this article, we propose an integrated likelihood method, MR-BOIL, to investigate causal relationships for binary outcomes by treating confounders as latent variables in one-sample Mendelian randomization. Under the assumption of a joint normal distribution of the confounders, we use the expectation-maximization algorithm to estimate the causal effect. Extensive simulations demonstrate that the MR-BOIL estimator is asymptotically unbiased and that our method improves statistical power without inflating the type I error rate. We then apply this method to data from the Atherosclerosis Risk in Communities Study. The results show that MR-BOIL identifies plausible causal relationships with high reliability, whereas the results of existing methods are unreliable. MR-BOIL is implemented in R, and the corresponding R code is freely available for download.
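The noncollapsibility problem described above can be illustrated with a small simulation. The sketch below is not MR-BOIL and is not the authors' R code. It only shows, under an assumed data-generating model, how a two-stage estimator carried over from the linear setting tends to shrink the causal log odds ratio toward zero when a confounder is left unmodeled. Every name and parameter value here (`simulate`, `two_stage_logit`, `oracle_logit`, `sigma_u`, the allele frequency 0.3, the instrument strength 0.5) is a hypothetical choice made for illustration.

```python
import numpy as np


def sigmoid(z):
    # Clip the linear predictor so exp() stays numerically stable.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def fit_logistic(design, y, n_iter=30):
    """Logistic regression by Newton-Raphson; `design` must include an intercept column."""
    coef = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = sigmoid(design @ coef)
        weights = p * (1.0 - p)
        gradient = design.T @ (y - p)
        hessian = (design * weights[:, None]).T @ design
        coef = coef + np.linalg.solve(hessian, gradient)
    return coef


def simulate(n=50_000, beta=1.0, sigma_u=2.0, seed=0):
    """Assumed model: one genetic instrument g, one normal confounder u."""
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)   # genotype coded 0/1/2
    u = rng.normal(0.0, sigma_u, size=n)             # unmeasured confounder
    x = 0.5 * g + u + rng.normal(size=n)             # continuous exposure
    y = rng.binomial(1, sigmoid(-0.5 + beta * x + u)).astype(float)  # binary outcome
    return g, x, u, y


def two_stage_logit(g, x, y):
    """Stage 1: regress x on g by least squares. Stage 2: logistic regression of y on fitted x."""
    stage1 = np.column_stack([np.ones_like(g), g])
    x_hat = stage1 @ np.linalg.lstsq(stage1, x, rcond=None)[0]
    return fit_logistic(np.column_stack([np.ones_like(x_hat), x_hat]), y)[1]


def oracle_logit(x, u, y):
    """Infeasible benchmark: logistic regression that observes the confounder u."""
    return fit_logistic(np.column_stack([np.ones_like(x), x, u]), y)[1]


if __name__ == "__main__":
    g, x, u, y = simulate()
    print("true conditional log odds ratio: 1.0")
    print("two-stage estimate:", two_stage_logit(g, x, y))
    print("oracle estimate (u observed):", oracle_logit(x, u, y))
```

In this setup the two-stage estimate should fall well below the true value, while the oracle fit that conditions on the confounder should land close to it. That gap is the motivation for treating the confounder as a latent variable and integrating it out of the likelihood, rather than ignoring it.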
