DDL: Deep Dictionary Learning for Predictive Phenotyping

Predictive phenotyping is the task of predicting which phenotypes will occur at a patient's next clinical visit from longitudinal Electronic Health Record (EHR) data. Although deep learning (DL) models have recently demonstrated strong performance in predictive phenotyping, they require large amounts of labeled data, which is expensive to acquire. To address this shortage of labels, we propose a deep dictionary learning framework (DDL) for phenotyping, which uses unlabeled data as a complementary source of information to generate a better, more succinct data representation. Our empirical evaluations on multiple EHR datasets demonstrate that DDL outperforms existing predictive phenotyping methods on a wide variety of clinical tasks that require patient phenotyping. The results also show that unlabeled data can be used to generate a better data representation, which helps improve DDL’s phenotyping performance over existing methods that use only labeled data.
