Doctor XAI: an ontology-based approach to black-box sequential data classification explanations

Several recent advancements in Machine Learning involve black-box models: algorithms that do not provide human-understandable explanations in support of their decisions. This limitation hampers the fairness, accountability and transparency of these models; the field of eXplainable Artificial Intelligence (XAI) tries to solve this problem by providing human-understandable explanations for black-box models. However, healthcare datasets (and the related learning tasks) often present peculiar features, such as sequential data, multi-label predictions, and links to structured background knowledge. In this paper, we introduce Doctor XAI, a model-agnostic explainability technique able to deal with multi-labeled, sequential, ontology-linked data. We focus on explaining Doctor AI, a multi-label classifier that takes as input the clinical history of a patient in order to predict the next visit. Furthermore, we show how exploiting the temporal dimension in the data and the domain knowledge encoded in the medical ontology improves the quality of the mined explanations.
