Cluster‐Specific Variable Selection for Product Partition Models

We propose a random partition model that implements prediction with many candidate covariates and interactions. The model is based on a modified product partition model that includes a regression on covariates by favouring homogeneous clusters in terms of these covariates. Additionally, the model allows for a cluster-specific choice of the covariates that are included in this evaluation of homogeneity. The variable selection is implemented by introducing a set of cluster-specific latent indicators that include or exclude covariates. The proposed model is motivated by an application to predicting mortality in an intensive care unit in Lisboa, Portugal.
