Early prediction of mortality risk among severe COVID-19 patients using machine learning

Abstract

Background: Coronavirus disease 2019 (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection has been spreading globally. The number of deaths has increased with the increase in the number of infected patients. We aimed to develop a clinical model to predict the outcome of severe COVID-19 patients early.

Methods: Epidemiological, clinical, and first laboratory findings after admission of 183 severe COVID-19 patients (115 survivors and 68 nonsurvivors) from the Sino-French New City Branch of Tongji Hospital were used to develop the predictive models. Five machine learning approaches (logistic regression, partial least squares regression, elastic net, random forest, and bagged flexible discriminant analysis) were used to select the features and predict the patients' outcomes. The area under the receiver operating characteristic curve (AUROC) was applied to compare the models' performance. Sixty-four severe COVID-19 patients from the Optical Valley Branch of Tongji Hospital were used to externally validate the final predictive model.

Results: The baseline characteristics and laboratory tests were significantly different between the survivors and nonsurvivors. Four variables (age, high-sensitivity C-reactive protein level, lymphocyte count, and d-dimer level) were selected by all five models. Given the similar performance among the models, the logistic regression model was selected as the final predictive model because of its simplicity and interpretability. The AUROCs of the derivation and external validation sets were 0.895 and 0.881, respectively. The sensitivity and specificity were 0.892 and 0.687 for the derivation set and 0.839 and 0.794 for the validation set, respectively, when using a probability of death of 50% as the cutoff. The individual risk score based on the four selected variables and the corresponding probability of death can serve as indexes to assess the mortality risk of COVID-19 patients. The predictive model is freely available at https://phenomics.fudan.edu.cn/risk_scores/.

Conclusions: Age, high-sensitivity C-reactive protein level, lymphocyte count, and d-dimer level of COVID-19 patients at admission are informative for the patients' outcomes.
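
The abstract reports three evaluation quantities: AUROC, and sensitivity and specificity at a 50% probability-of-death cutoff. As a minimal sketch of how such quantities can be computed from predicted probabilities, the Python below uses only the standard library. It is not the authors' code: the function names, the convention that a probability equal to the cutoff counts as a predicted death, and the toy labels and probabilities are all assumptions made for illustration, and none of the values come from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], p_death: Sequence[float], cutoff: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) with death (label 1) as the positive class.

    Assumption: a predicted probability >= cutoff is classified as death.
    """
    pairs = list(zip(y_true, p_death))
    tp = sum(1 for y, p in pairs if y == 1 and p >= cutoff)  # deaths predicted as deaths
    fn = sum(1 for y, p in pairs if y == 1 and p < cutoff)   # deaths missed
    tn = sum(1 for y, p in pairs if y == 0 and p < cutoff)   # survivors predicted as survivors
    fp = sum(1 for y, p in pairs if y == 0 and p >= cutoff)  # survivors flagged as deaths
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true: Sequence[int], p_death: Sequence[float]) -> float:
    """AUROC as the fraction of (nonsurvivor, survivor) pairs in which the
    nonsurvivor receives the higher predicted probability; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, p_death) if y == 1]
    neg = [p for y, p in zip(y_true, p_death) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (1 = nonsurvivor, 0 = survivor).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
p_death = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1, 0.3, 0.8]

sens, spec = sensitivity_specificity(y_true, p_death, cutoff=0.5)
auc = auroc(y_true, p_death)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUROC={auc:.4f}")
```

Because AUROC is computed from the ranking of predicted probabilities alone, it does not depend on the cutoff, whereas sensitivity and specificity change whenever the cutoff moves. That is why the abstract states the 50% threshold alongside those two figures.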
