Discovery and inclusion of SOFA score episodes in mortality prediction

Predicting the survival status of Intensive Care patients at the end of their hospital stay is useful for various clinical and organizational tasks. Current mortality prediction models are logistic regression models that rely solely on data collected during the first 24 hours after admission. These models do not exploit the information contained in the daily organ failure scores that are now routinely collected in many Intensive Care Units. We propose a novel method for mortality prediction that uses these daily data in addition to admission-related data. The method discovers temporal patterns of the organ failure scores, called episodes, in a data-driven manner and embeds them in the familiar logistic regression framework for prediction. It yields a set of D logistic regression models, one for each of the first D days of Intensive Care Unit stay. The model for day d ≤ D is trained on the patient subpopulation that stayed at least d days in the Intensive Care Unit and predicts, for such patients, the probability of death at the end of hospital stay. We applied our method, with a specific form of episodes called aligned episodes, to a large dataset of Intensive Care Unit patients for the first 5 days of stay in the unit (D = 5). We compared our models with models developed on the same patient subpopulations without the episodes. The new models show improved performance on each of the five days. They also provide insight into the effect of the various selected episodes on mortality.
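The per-day construction described above can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: the SOFA cut-offs used to categorize scores, the reading of an "aligned episode" as a run of consecutive daily categories ending on the prediction day d, the support threshold, and all field and function names are hypothetical. The sketch only builds the day-d subpopulation and the episode indicator columns; fitting the logistic regression itself is left to any standard package.

```python
from collections import Counter


def categorize(sofa):
    """Map a daily SOFA score to a coarse category (hypothetical cut-offs)."""
    if sofa <= 6:
        return "L"
    if sofa <= 11:
        return "M"
    return "H"


def aligned_episodes(scores, d, max_len=3):
    """Episodes ending on day d: the last 1..max_len categories up to day d.

    Assumption: 'aligned' means the episode ends on the prediction day.
    """
    cats = [categorize(s) for s in scores[:d]]
    return {"".join(cats[-k:]) for k in range(1, min(max_len, d) + 1)}


def frequent_episodes(patients, d, min_support=0.5, max_len=3):
    """Select the day-d subpopulation and the episodes frequent within it."""
    # Only patients who stayed at least d days contribute to the day-d model.
    cohort = [p for p in patients if len(p["sofa"]) >= d]
    counts = Counter()
    for p in cohort:
        counts.update(aligned_episodes(p["sofa"], d, max_len))
    n = len(cohort)
    selected = sorted(e for e, c in counts.items() if c / n >= min_support)
    return cohort, selected


def design_rows(cohort, episodes, d, max_len=3):
    """One row per patient: admission score plus a 0/1 column per episode."""
    rows = []
    for p in cohort:
        present = aligned_episodes(p["sofa"], d, max_len)
        rows.append([p["admission_score"]] + [int(e in present) for e in episodes])
    return rows


# Toy data, invented purely for illustration.
patients = [
    {"admission_score": 40, "sofa": [5, 6, 7], "died": 0},
    {"admission_score": 55, "sofa": [12, 13], "died": 1},
    {"admission_score": 30, "sofa": [3], "died": 0},
    {"admission_score": 60, "sofa": [8, 12, 14], "died": 1},
]

cohort_day2, episodes_day2 = frequent_episodes(patients, d=2)
X_day2 = design_rows(cohort_day2, episodes_day2, d=2)
y_day2 = [p["died"] for p in cohort_day2]
```

Repeating this for d = 1, ..., D gives one design matrix per day, each of which could then be passed to a logistic regression routine alongside the outcome vector. A comparison model without episodes would use only the first column.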
