Paper information - Data preprocessing and mortality prediction: The Physionet/CinC 2012 challenge revisited

The Physionet/CinC 2012 challenge focused on improving patient-specific mortality predictions in the intensive care unit. While most of the focus in the challenge was on applying sophisticated machine learning algorithms, little attention was paid to how the data were preprocessed beforehand. We compare four standard preprocessing methods with a novel Box-Cox outlier rejection technique and analyze their effect on machine learning classifiers for predicting the mortality of ICU patients. The best machine learning model utilized the proposed preprocessing method and achieved an AUROC of 0.848. In general, models using our novel preprocessing method achieved a higher AUROC, with gains of as much as 0.02 in some cases. Furthermore, the use of preprocessing improved the performance of regression models to a higher level than that of non-linear techniques such as random forests. We demonstrate that proper preprocessing of the data prior to use in a prognostic model can significantly improve performance. This improvement can be even greater than that provided by more complex non-linear machine learning algorithms.
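The abstract does not spell out the Box-Cox outlier rejection procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes strictly positive measurements, picks the Box-Cox parameter by a grid search over [-2, 2], and marks as missing any value lying more than `k` standard deviations from the mean in the transformed space. The function names, the grid, the threshold `k`, and the use of `None` for rejected values are all illustrative assumptions.

```python
import math
from statistics import mean, pstdev
from typing import List, Optional


def box_cox(x: float, lam: float) -> float:
    """Box-Cox transform of a strictly positive value x."""
    if abs(lam) < 1e-12:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def profile_log_likelihood(values: List[float], lam: float) -> float:
    """Box-Cox profile log-likelihood for lam, up to an additive constant."""
    n = len(values)
    transformed = [box_cox(v, lam) for v in values]
    variance = pstdev(transformed) ** 2  # must be > 0, i.e. values not all equal
    log_sum = sum(math.log(v) for v in values)
    return -0.5 * n * math.log(variance) + (lam - 1.0) * log_sum


def fit_lambda(values: List[float]) -> float:
    """Choose lam by grid search; the grid [-2, 2] in steps of 0.1 is an assumption."""
    grid = [i / 10.0 for i in range(-20, 21)]
    return max(grid, key=lambda lam: profile_log_likelihood(values, lam))


def reject_outliers(
    values: List[float], k: float = 3.0, lam: Optional[float] = None
) -> List[Optional[float]]:
    """Replace values farther than k standard deviations from the mean
    (measured in Box-Cox space) with None; other values pass through unchanged."""
    if lam is None:
        lam = fit_lambda(values)
    transformed = [box_cox(v, lam) for v in values]
    mu, sigma = mean(transformed), pstdev(transformed)
    return [
        v if abs(t - mu) <= k * sigma else None
        for v, t in zip(values, transformed)
    ]
```

Rejected entries could then be handled by whatever imputation the downstream model uses. Note that a single extreme value can pull the fitted parameter toward a transform that compresses it, so in practice the parameter might be estimated on a trimmed or reference sample and passed in through `lam`.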

Gari D. Clifford | Alistair E. W. Johnson | Andrew A. Kramer
