Shedding Light on Black Box Machine Learning Algorithms: Development of an Axiomatic Framework to Assess the Quality of Methods that Explain Individual Predictions

From self-driving vehicles and back-flipping robots to virtual assistants that book our next appointment at the hair salon or a table at a restaurant for dinner, machine learning systems are becoming increasingly ubiquitous. The main reason for this is their remarkable predictive capability. However, most of these models remain black boxes, meaning that it is very difficult for humans to follow and understand their intricate inner workings. As a result, interpretability has suffered under the ever-increasing complexity of machine learning models. Especially in light of new regulations such as the General Data Protection Regulation (GDPR), predictions made by these black boxes must be plausible and verifiable. Driven by the needs of industry and practice, the research community has recognised this interpretability problem and has, over the past few years, focussed on developing a growing number of so-called explanation methods. These methods explain individual predictions made by black box machine learning models and help to recover some of the lost interpretability. With the proliferation of such explanation methods, however, it is often unclear which method offers higher explanation quality or is generally better suited to the situation at hand. In this thesis, we therefore propose an axiomatic framework that allows the explanation quality of different explanation methods to be compared. Through experimental validation, we find that the framework is useful for assessing the explanation quality of different explanation methods and that it yields conclusions consistent with independent research.
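To make concrete what such an explanation method produces, the minimal sketch below uses the SHAP library (one widely used explanation method) to attribute a single prediction of a tree ensemble to its input features. This is an illustrative example under assumed tooling (Python with shap and scikit-learn installed), not code from the thesis itself.

```python
# Illustrative sketch: explaining one individual prediction with SHAP.
# Dataset and model are placeholders; any fitted tree ensemble would do.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer decomposes a single prediction into per-feature contributions
# (Shapley values), i.e. an "explanation" of that one prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[[0]])  # attributions for instance 0
```

Frameworks like the one proposed in this thesis aim to compare the quality of such per-prediction attributions across different explanation methods.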
