Interpretable Deep Models for ICU Outcome Prediction

Exponential surge in health care data, such as longitudinal data from electronic health records (EHR), sensor data from intensive care unit (ICU), etc., is providing new opportunities to discover meaningful data-driven characteristics and patterns ofdiseases. Recently, deep learning models have been employedfor many computational phenotyping and healthcare prediction tasks to achieve state-of-the-art performance. However, deep models lack interpretability which is crucial for wide adoption in medical research and clinical decision-making. In this paper, we introduce a simple yet powerful knowledge-distillation approach called interpretable mimic learning, which uses gradient boosting trees to learn interpretable models and at the same time achieves strong prediction performance as deep learning models. Experiment results on Pediatric ICU dataset for acute lung injury (ALI) show that our proposed method not only outperforms state-of-the-art approaches for morality and ventilator free days prediction tasks but can also provide interpretable models to clinicians.

[1]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[2]  Ruslan Salakhutdinov,et al.  Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning , 2015, ICLR.

[3]  T. Lasko,et al.  Computational Phenotype Discovery Using Unsupervised Feature Learning over Noisy, Sparse, and Irregular Clinical Data , 2013, PloS one.

[4]  Rich Caruana,et al.  Model compression , 2006, KDD '06.

[5]  Geoffrey E. Hinton,et al.  Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[6]  T. Alonzo,et al.  Effect of tidal volume in children with acute hypoxemic respiratory failure , 2009, Intensive Care Medicine.

[7]  Joan Bruna,et al.  Intriguing properties of neural networks , 2013, ICLR.

[8]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[9]  J. Friedman Greedy function approximation: A gradient boosting machine. , 2001 .

[10]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[11]  Yoshua Bengio,et al.  Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.

[12]  Li Li,et al.  Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records , 2016, Scientific Reports.

[13]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[14]  Razvan Pascanu,et al.  Policy Distillation , 2015, ICLR.

[15]  Yifan Gong,et al.  Learning small-size DNN with output-distribution-based criteria , 2014, INTERSPEECH.

[16]  G. Bonner Decision making for health care professionals: use of decision trees within the community mental health setting. , 2001, Journal of advanced nursing.

[17]  Pascal Vincent,et al.  Visualizing Higher-Layer Features of a Deep Network , 2009 .

[18]  Rich Caruana,et al.  Do Deep Nets Really Need to be Deep? , 2013, NIPS.

[19]  Peter Dayan,et al.  Computational Phenotyping of Two-Person Interactions Reveals Differential Neural Response to Depth-of-Thought , 2012, PLoS Comput. Biol..

[20]  Jimeng Sun,et al.  Marble: high-throughput phenotyping from electronic health records via sparse nonnegative tensor factorization , 2014, KDD.

[21]  Aasthaa Bansal,et al.  Further insight into the incremental value of new markers: the interpretation of performance measures and the importance of clinical context. , 2012, American journal of epidemiology.

[22]  Kurt Hornik,et al.  Multilayer feedforward networks are universal approximators , 1989, Neural Networks.

[23]  Pei-Chann Chang,et al.  A hybrid model combining case-based reasoning and fuzzy decision tree for medical data classification , 2011, Appl. Soft Comput..

[24]  Suchi Saria,et al.  Clustering Longitudinal Clinical Marker Trajectories from Electronic Health Data: Applications to Phenotyping and Endotype Discovery , 2015, AAAI.

[25]  Fei Wang,et al.  From micro to macro: data driven phenotyping by densification of longitudinal electronic medical records , 2014, KDD.

[26]  Lei Lei,et al.  R-C4.5 decision tree model and its applications to health care dataset , 2005, Proceedings of ICSSSM '05. 2005 International Conference on Services Systems and Services Management, 2005..

[27]  Darcy A. Davis,et al.  Bringing Big Data to Personalized Healthcare: A Patient-Centered Framework , 2013, Journal of General Internal Medicine.

[28]  J. Ross Quinlan,et al.  Induction of Decision Trees , 1986, Machine Learning.

[29]  Razvan Pascanu,et al.  Theano: new features and speed improvements , 2012, ArXiv.

[30]  Yan Liu,et al.  Deep Computational Phenotyping , 2015, KDD.

[31]  George Hripcsak,et al.  Next-generation phenotyping of electronic health records , 2012, J. Am. Medical Informatics Assoc..

[32]  Yoshua Bengio,et al.  On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.

[33]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[34]  Benjamin M. Marlin,et al.  Unsupervised pattern discovery in electronic health care data using probabilistic clustering models , 2012, IHI '12.

[35]  Yan Liu,et al.  Causal Phenotype Discovery via Deep Networks , 2015, AMIA.

[36]  J. Friedman Stochastic gradient boosting , 2002 .