BiMM tree: a decision tree method for modeling clustered and longitudinal binary outcomes

Abstract: Clustered binary outcomes are frequently encountered in clinical research (e.g., in longitudinal studies). Generalized linear mixed models (GLMMs) for clustered endpoints are difficult to apply in some scenarios, for example when the data contain multi-way interactions or nonlinear predictor effects that are not known a priori. We develop an alternative, data-driven method called the Binary Mixed Model (BiMM) tree, which combines a decision tree and a GLMM within a unified framework. Simulation studies show that BiMM tree achieves slightly higher or similar accuracy compared to standard methods. The method is applied to a real dataset from the Acute Liver Failure Study Group.
