Machine Learning to Predict Mortality and Critical Events in COVID-19 Positive New York City Patients

Coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 virus, has become one of the deadliest pandemics in modern history, reaching nearly every country worldwide and overwhelming healthcare institutions. As of April 20, 2020, there have been more than 2.4 million confirmed cases and over 160,000 deaths. Extreme case surges, coupled with challenges in forecasting the clinical course of affected patients, have necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods for achieving this are lacking. In this paper, we present a decision tree-based machine learning model trained on electronic health records from patients with confirmed COVID-19 at a single center within the Mount Sinai Health System in New York City. We then externally validate our model by predicting the likelihood of a critical event or death within various time intervals after hospitalization for patients at four other hospitals, and achieve strong performance, notably predicting mortality at 1 week with an AUC-ROC of 0.84. Finally, we establish model interpretability by calculating SHAP scores to identify decisive features, including age, inflammatory markers (procalcitonin and LDH), and coagulation parameters (PT, PTT, D-dimer). To our knowledge, this is one of the first externally validated models that both predicts outcomes in COVID-19 patients with strong performance and identifies key contributors to outcome prediction, which may assist clinicians in making effective patient management decisions.
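To make the reported AUC-ROC concrete, the following is a minimal sketch of how the metric can be computed from predicted risks and observed outcomes. It is not the study's evaluation code: the function name `auc_roc` and the toy labels and scores are hypothetical, chosen only to illustrate that AUC-ROC is the probability that a randomly chosen patient who experienced the outcome receives a higher predicted risk than a randomly chosen patient who did not.

```python
def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison of positives against negatives.

    labels: iterable of 0/1 outcomes (1 = event occurred).
    scores: iterable of predicted risks, higher = more likely event.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study):
# 1 = died within 7 days of admission, 0 = survived.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.90, 0.70, 0.30, 0.80, 0.40, 0.20, 0.10, 0.05]

print(auc_roc(labels, scores))  # 12 of 15 positive/negative pairs ranked correctly -> 0.8
```

On this reading, an AUC-ROC of 0.84 means that in roughly 84% of such patient pairs the model assigns the higher risk to the patient who died. The quadratic pairwise loop is used here for clarity; a rank-based computation would be preferable for large cohorts.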

Riccardo Miotto | Kipp W. Johnson | Fayzan F. Chaudhry | Akhil Vaid | Shan P Zhao | Matteo Danieletto | Eddye Golden | Ishan Paranjpe | Patricia Glowe | Bethany Percha | Allan C Just | Prem Timsina | Judy H. Cho | Eric E Schadt | Andrew Kasarskis | Carol R Horowitz | Zahi A Fayad | Sulaiman S Somani | Jessica K De Freitas | Alexander W Charney | Girish N Nadkarni | Patricia Kovatch | Carlos Cordon-Cardo | Emilia Bagiella | Nidhi Naik | Valentin Fuster | Joseph Finkelstein | Benjamin S Glicksberg | Kodi B. Arfer | Anuradha Lala | Arash Kia | Dennis S Charney | Eric J Nestler | Jagat Narula | Judith A Aberg | Noam D Beckmann | Samuel J Lee | Erwin P Bottinger | David L Reich | Robert M Freeman | Matthew A Levin | Adam J Russak | Manish Paranjpe | Manbir Singh | Dara Meyer | Paul F. O’Reilly | Laura M. Huckins | Edgar Argulian | Barbara Murphy
