SCEHR: Supervised Contrastive Learning for Clinical Risk Prediction using Electronic Health Records

Contrastive learning has demonstrated promising performance in the image and text domains, in both self-supervised and supervised settings. In this work, we extend the supervised contrastive learning framework to clinical risk prediction problems based on longitudinal electronic health records (EHR). We propose a general supervised contrastive loss, $\mathcal{L}_{\text{Contrastive Cross Entropy}} + \lambda \mathcal{L}_{\text{Supervised Contrastive Regularizer}}$, for learning both binary classification (e.g., in-hospital mortality prediction) and multi-label classification (e.g., phenotyping) in a unified framework. The loss implements the key idea of contrastive learning, namely pulling similar samples closer together and pushing dissimilar ones apart, through its two components: $\mathcal{L}_{\text{Contrastive Cross Entropy}}$ contrasts samples with learned anchors that represent the positive and negative clusters, and $\mathcal{L}_{\text{Supervised Contrastive Regularizer}}$ contrasts samples with each other according to their supervised labels. We propose two versions of this supervised contrastive loss. Our experiments on real-world EHR data show that the proposed loss functions improve the performance of strong baselines, and even of state-of-the-art models, on benchmark tasks for clinical risk prediction. The loss functions work well with extremely imbalanced data, which is common in clinical risk prediction problems, and they can easily replace the (binary or multi-label) cross-entropy loss adopted in existing clinical predictive models. The PyTorch code is released at https://github.com/calvin-zcx/SCEHR.
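To make the two-component structure concrete, the following is a minimal pure-Python sketch of a loss of the form $\mathcal{L}_{\text{Contrastive Cross Entropy}} + \lambda \mathcal{L}_{\text{Supervised Contrastive Regularizer}}$ for the binary case only. It is not the released implementation. Several choices here are assumptions made for illustration: cosine similarity with a temperature, a softmax over similarities to one anchor per class for the first term, a SupCon-style average over same-label pairs for the second term, and all function and parameter names (`scehr_style_loss`, `lam`, `temperature`, and so on). In practice the anchors would be trainable parameters; here they are passed in as fixed vectors.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    # Scale to unit length so that dot products are cosine similarities.
    norm = math.sqrt(dot(v, v)) or 1.0
    return [a / norm for a in v]

def log_softmax_at(scores, index):
    # Numerically stable log(softmax(scores)[index]).
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[index] - lse

def contrastive_cross_entropy(z, labels, anchors, temperature=0.1):
    # Assumed form: each sample is contrasted with one anchor per class;
    # anchors[c] stands in for the learned anchor of class c.
    total = 0.0
    for zi, yi in zip(z, labels):
        scores = [dot(zi, a) / temperature for a in anchors]
        total -= log_softmax_at(scores, yi)
    return total / len(z)

def supervised_contrastive_regularizer(z, labels, temperature=0.1):
    # Assumed SupCon-style form: each sample is contrasted with all other
    # samples in the batch; samples sharing its label are the positives.
    total, counted = 0.0, 0
    n = len(z)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [k for k, j in enumerate(others) if labels[j] == labels[i]]
        if not positives:
            continue  # no same-label partner in this batch
        scores = [dot(z[i], z[j]) / temperature for j in others]
        total -= sum(log_softmax_at(scores, k) for k in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0

def scehr_style_loss(embeddings, labels, anchors, lam=0.5, temperature=0.1):
    # Combined loss: anchor-based term plus lam times the pairwise term.
    z = [normalize(e) for e in embeddings]
    a = [normalize(x) for x in anchors]
    cce = contrastive_cross_entropy(z, labels, a, temperature)
    scr = supervised_contrastive_regularizer(z, labels, temperature)
    return cce + lam * scr

# Toy batch: two embeddings near the positive anchor, two near the negative one.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
anchors = [[0.0, 1.0], [1.0, 0.0]]  # anchors[0]: negative class, anchors[1]: positive class
aligned = scehr_style_loss(embeddings, [1, 1, 0, 0], anchors)
mixed = scehr_style_loss(embeddings, [1, 0, 1, 0], anchors)
```

On this toy batch, labels that agree with the geometry of the embeddings (`aligned`) give a smaller loss than labels that cut across it (`mixed`), which is the pull-together, push-apart behavior described above. Setting `lam=0` reduces the sketch to the anchor-based term alone.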
