Unraveling Patient Heterogeneity in ICU With Deep Embedded Clustering Using Co-morbidity, Clinical Examination, and Laboratory Data

Introduction Despite extensive research, the goal of unravelling patient heterogeneity in critical care remains largely unattained. Combining clustering analysis of routinely collected high-frequency data with the identification of features driving cluster separation may constitute a step towards improving patient characterization.
Methods In this study, we analysed prospectively collected data from 743 patients including co-morbidities, clinical examination, and laboratory parameters. We compared four clustering methodologies – deep embedded clustering (DEC), hierarchical clustering with and without dynamic time warping, and k-means – and trained a classifier to predict and validate cluster membership. The contribution of different variables to the predicted cluster membership was assessed using SHapley Additive exPlanations values.
Results DEC yielded better results compared to the traditional clustering algorithms, with the best Jaccard and entropy scores being achieved for 6 clusters. These clusters were characterized as medium to high co-morbidity patients with respiratory pathology and sepsis (cluster 1), patients with primarily acute and chronic cardiac conditions and surgical admission (cluster 2), patients with diverse disease etiology and poor outcomes (cluster 3), low co-morbidity neurological, neurosurgical, and trauma patients (cluster 4), medium co-morbidity patients with cardio-respiratory problems, and neuro-trauma patients with longer length of stay (cluster 5), and patients with sepsis and respiratory infections (cluster 6). All clusters differed in in-ICU, 30-day, and 90-day mortality, as well as incidence of acute kidney injury, and two clusters were categorized as having higher mortality risk, and one cluster as lower mortality risk.
Conclusions This machine learning methodology, which we made publicly available, is a possible solution to challenges previously encountered by clustering analyses, and may help unravel patient heterogeneity in critical care.
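The abstract reports Jaccard scores for the clusterings but does not say how they were computed. As an illustration only, the sketch below shows one common way to use the Jaccard index for cluster stability: recluster a resampled subset of patients, then match each original cluster to its most similar cluster in the resampled run. The function names and toy patient IDs are hypothetical and are not taken from the authors' published code.

```python
def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| between two collections of IDs."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def clusters_from_labels(labels):
    """Group IDs by cluster label: {label: set of IDs}."""
    clusters = {}
    for item, label in labels.items():
        clusters.setdefault(label, set()).add(item)
    return clusters


def best_match_jaccard(reference, resampled):
    """For each reference cluster, return the highest Jaccard similarity to
    any cluster of the resampled run, using only IDs present in both runs."""
    shared = set(reference) & set(resampled)
    ref = clusters_from_labels({i: reference[i] for i in shared})
    res = clusters_from_labels({i: resampled[i] for i in shared})
    return {
        label: max((jaccard(members, other) for other in res.values()), default=0.0)
        for label, members in ref.items()
    }


# Toy example (hypothetical data): six patients in the reference clustering,
# five of them re-clustered in a resampled run with arbitrary label names.
reference = {"p1": 0, "p2": 0, "p3": 0, "p4": 1, "p5": 1, "p6": 1}
resampled = {"p1": "a", "p2": "a", "p4": "b", "p5": "b", "p6": "a"}
stability = best_match_jaccard(reference, resampled)
```

Averaging these per-cluster values over many resampled runs gives a stability estimate for each cluster; values close to 1 indicate that a cluster is recovered consistently, which is one way a preferred number of clusters can be chosen.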
