Automated Meta-Analysis: A Causal Learning Perspective

Meta-analysis is a systematic approach for understanding a phenomenon by analyzing the results of many previously published experimental studies. It is central to deriving conclusions about the summary effect of treatments and interventions in medicine, poverty alleviation, and other applications with social impact. Unfortunately, meta-analysis requires substantial human effort, which makes the process inefficient and vulnerable to human bias. To overcome these issues, we work toward automating meta-analysis with a focus on controlling for risks of bias. We first extract information from scientific publications written in natural language. Then, taking a novel causal learning perspective, we frame automated meta-analysis, with the output of the extraction step as its input, as a multiple-causal-inference problem in which the summary effect is obtained through intervention. Building on existing efforts to automate the initial steps of meta-analysis, the proposed approach automates the remaining analysis and substantially reduces the human effort involved. Evaluations on synthetic and semi-synthetic datasets show that the approach yields promising results.
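For readers unfamiliar with the term, the "summary effect" of a conventional meta-analysis is often computed as an inverse-variance weighted average of per-study effect estimates. The sketch below illustrates that standard fixed-effect baseline only; it is not the causal approach proposed here, and the function name and the three study values are hypothetical, chosen purely for illustration.

```python
import math


def fixed_effect_summary(effects, variances):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error.

    effects   -- per-study effect estimates (hypothetical values below)
    variances -- per-study sampling variances of those estimates
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("effects and variances must be non-empty and equal length")
    # Each study is weighted by the inverse of its sampling variance,
    # so more precise studies contribute more to the pooled estimate.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


# Illustrative, made-up inputs: three studies with different precision.
effects = [0.30, 0.10, 0.25]
variances = [0.04, 0.01, 0.02]
pooled, se = fixed_effect_summary(effects, variances)
print(f"summary effect = {pooled:.4f}, standard error = {se:.4f}")
```

This baseline treats each reported effect as given and does not adjust for confounding or risk of bias across studies, which is the gap the causal framing described above aims to address.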
