Considerations for Evaluation and Generalization in Interpretable Machine Learning

As machine learning systems become ubiquitous, there has been a surge of interest in interpretable machine learning: systems that provide explanations for their outputs. These explanations are often used to qualitatively assess other criteria such as safety or non-discrimination. However, despite this interest, there is little consensus on what interpretable machine learning is or how it should be measured and evaluated. In this paper, we discuss definitions of interpretability and describe when interpretability is needed (and when it is not). We then present a taxonomy for rigorous evaluation and offer recommendations for researchers. We conclude with open questions and concrete problems for new researchers.