Neural Document Embeddings for Intensive Care Patient Mortality Prediction

We present an automatic mortality prediction scheme based on the unstructured textual content of clinical notes. We propose a convolutional document embedding approach, and our empirical investigation on the MIMIC-III intensive care database shows significant performance gains over previously employed methods such as latent topic distributions or generic doc2vec embeddings. These improvements are especially pronounced for the difficult problem of post-discharge mortality prediction.
