Sepsis Prediction using Continuous and Categorical Features on Sporadic Data

Sepsis is one of the most prevalent causes of mortality in Intensive Care Units (ICUs) and one of the most expensive health-care problems. Delayed treatment is associated with increased mortality and financial burden. This work proposes a method for early prediction of sepsis, validated on the PhysioNet Challenge 2019 dataset. The challenge is to extract continuous, categorical, and domain-specific discriminating features from highly sporadic lab data and vital signs. We find that imputing extremely isolated data lowers prediction performance. To mitigate this, we apply a sliding window to the sporadic data to generate continuous features that capture the trend. We also devise a binning approach that generates categorical features from the aperiodic data in order to capture deviation from normalcy. Lastly, we observe that a logical fusion of Random Forest and Logit Boost provides the best performance among the pipelines evaluated. The Normalized Utility Score (NUS) is used to benchmark the proposed baselines. Five-fold cross-validation of the best-performing pipeline across the data yields a median NUS of 0.401.
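The two feature-generation ideas above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the window length, the chosen summary statistics (count, mean, first-to-last change), and the reference range are all assumptions made for illustration. The sketch summarises only the values actually observed inside each window, so no imputation is needed, and it bins a measurement against a hypothetical normal range.

```python
import math
from typing import Dict, List, Optional


def _is_missing(value: Optional[float]) -> bool:
    """Treat None and NaN as 'not measured at this hour'."""
    return value is None or math.isnan(value)


def window_features(values: List[Optional[float]], window: int = 6) -> List[Dict]:
    """Per-hour trend features from the observed values in the trailing window.

    Assumption: `values` holds one entry per hour, with None/NaN where no
    measurement was taken. Missing entries are skipped, never imputed.
    """
    features = []
    for t in range(len(values)):
        start = max(0, t - window + 1)
        observed = [v for v in values[start:t + 1] if not _is_missing(v)]
        if not observed:
            features.append({"count": 0, "mean": None, "delta": None})
            continue
        features.append({
            "count": len(observed),                 # how many samples were seen
            "mean": sum(observed) / len(observed),  # level within the window
            "delta": observed[-1] - observed[0],    # first-to-last change (trend)
        })
    return features


def bin_value(value: Optional[float], low: float, high: float) -> str:
    """Categorical feature: position relative to a (hypothetical) normal range."""
    if _is_missing(value):
        return "missing"
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"


if __name__ == "__main__":
    # Hypothetical sparse lab series: two measurements in six hours.
    lab = [None, 1.0, None, None, 3.0, None]
    for hour, feats in enumerate(window_features(lab, window=4)):
        # The range 0.5-2.0 is illustrative only, not a clinical reference.
        print(hour, feats, bin_value(lab[hour], low=0.5, high=2.0))
```

Keeping an explicit "missing" category and a per-window sample count lets a downstream classifier use the absence of a measurement as a signal rather than hiding it behind an imputed value.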
