Explaining Deep Classification of Time-Series Data with Learned Prototypes

The emergence of deep learning networks raises a need for explainable AI so that users and domain experts can be confident applying them to high-risk decisions. In this paper, we leverage data from the latent space induced by deep learning models to learn stereotypical representations, or "prototypes", during training to elucidate the algorithmic decision-making process. We study how leveraging prototypes affects classification decisions on two-dimensional time-series data in three settings: (1) electrocardiogram (ECG) waveforms to detect clinical bradycardia, a slowing of heart rate, in preterm infants, (2) respiration waveforms to detect apnea of prematurity, and (3) audio waveforms to classify spoken digits. We improve upon existing models by optimizing for increased prototype diversity and robustness, and we visualize how the model uses these prototypes in the latent space to distinguish classes. We show that the prototypes are capable of learning real-world features (bradycardia in ECG, apnea in respiration, and articulation in speech) as well as features within sub-classes. Our novel work applies a learned prototypical framework to two-dimensional time-series data to produce explainable insights during classification tasks.
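To make the idea of classifying through prototypes in a latent space concrete, the following is a minimal NumPy sketch, not the implementation used in this work. It assumes that latent vectors have already been produced by some encoder, that similarity is measured by squared Euclidean distance, and that diversity is encouraged by an inverse-distance penalty between prototypes. The function names (`prototype_distances`, `diversity_penalty`, `classify`), the toy values, and the specific penalty form are all hypothetical choices made for illustration.

```python
import numpy as np


def prototype_distances(z, prototypes):
    """Squared Euclidean distance from each latent vector to each prototype.

    z: array of shape (n, d); prototypes: array of shape (m, d).
    Returns an array of shape (n, m).
    """
    diff = z[:, None, :] - prototypes[None, :, :]
    return np.sum(diff ** 2, axis=2)


def diversity_penalty(prototypes, eps=1e-8):
    """Illustrative (assumed) penalty that grows as prototypes crowd together.

    It averages the inverse squared distance over all distinct prototype pairs.
    """
    m = prototypes.shape[0]
    d = prototype_distances(prototypes, prototypes)
    upper = np.triu_indices(m, k=1)  # each unordered pair once
    return float(np.mean(1.0 / (d[upper] + eps)))


def classify(z, prototypes, weights):
    """Turn negative prototype distances into class probabilities.

    weights: array of shape (m, c) mapping m prototypes to c classes.
    """
    logits = -prototype_distances(z, prototypes) @ weights
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


# Toy example: two prototypes in a 2-D latent space, one per class (assumed values).
prototypes = np.array([[0.0, 0.0], [4.0, 4.0]])
weights = np.eye(2)
z = np.array([[0.1, -0.2], [3.9, 4.2]])

probs = classify(z, prototypes, weights)
nearest = prototype_distances(z, prototypes).argmin(axis=1)
spread_penalty = diversity_penalty(prototypes)
crowded_penalty = diversity_penalty(np.array([[0.0, 0.0], [0.1, 0.1]]))
```

In this sketch, the nearest prototype for each input serves as the explanation for its predicted class, and adding a term like `diversity_penalty` to a training loss would discourage prototypes from collapsing onto one another.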
