Explaining black box models by means of local rules

Many high-performance machine learning methods produce black box models, which do not disclose the internal logic behind their predictions. However, in many application domains, understanding the motivation for a prediction is becoming a prerequisite for trusting the prediction itself. We propose a novel rule-based method that explains the prediction of any classifier on a specific instance by analyzing the joint effect of feature subsets on the classifier prediction. The relevant subsets are identified by learning a local rule-based model in the neighborhood of the prediction to be explained. While the local rules give qualitative insight into the local behavior, their relevance is quantified by means of the prediction difference. Preliminary experiments show that, despite the approximation introduced by the local model, the explanations provided by our method are effective in detecting the effects of attribute correlation. Our method is model-agnostic; hence, experts can compare the explanations and local behaviors obtained for the same instance from different classifiers.
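
To make the described pipeline concrete, the following is a minimal sketch of the three steps outlined above: sampling a neighborhood around the instance, fitting a local interpretable model, and scoring the resulting rule with the prediction difference. It assumes details not stated in the abstract: a scikit-learn-style black box exposing predict/predict_proba, Gaussian perturbation to build the neighborhood, and a shallow decision tree standing in for the paper's local rule-based learner. Names such as sample_neighborhood and explain_instance are illustrative, not part of the original method.

```python
# Minimal sketch of a local rule-based explanation with prediction difference.
# Assumptions (not from the paper): scikit-learn black box, Gaussian neighborhood,
# shallow decision tree as a stand-in for the local rule-based classifier.
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def sample_neighborhood(x, scale, n_samples=1000, seed=None):
    """Perturb the 1-D instance x with Gaussian noise to build a local neighborhood."""
    rng = np.random.default_rng(seed)
    return x + rng.normal(0.0, scale, size=(n_samples, x.shape[0]))


def explain_instance(black_box, x, feature_scales, n_samples=1000, seed=0):
    """Return the feature subset of the local rule covering x and its prediction difference."""
    # 1. Build the neighborhood and label it with the black-box predictions.
    Z = sample_neighborhood(x, feature_scales, n_samples, seed)
    y = black_box.predict(Z)

    # 2. Learn a local, interpretable surrogate (a shallow decision tree here,
    #    used in place of the paper's rule-based model).
    surrogate = DecisionTreeClassifier(max_depth=3).fit(Z, y)

    # 3. Read off the rule (root-to-leaf path) covering the instance x.
    node_path = surrogate.decision_path(x.reshape(1, -1)).indices
    rule_features = sorted({int(surrogate.tree_.feature[n]) for n in node_path
                            if surrogate.tree_.feature[n] >= 0})

    # 4. Quantify the rule's relevance with the prediction difference:
    #    probability of the predicted class at x versus the same probability
    #    when the rule's features are perturbed as in the neighborhood.
    proba = black_box.predict_proba(x.reshape(1, -1))[0]
    target_idx = int(np.argmax(proba))
    p_original = proba[target_idx]

    Z_marginal = np.tile(x, (n_samples, 1))
    Z_marginal[:, rule_features] = Z[:, rule_features]
    p_marginal = black_box.predict_proba(Z_marginal)[:, target_idx].mean()

    return rule_features, p_original - p_marginal
```

Because the procedure only queries predict and predict_proba, the same sketch can be run against any classifier, which is what allows explanations for the same instance to be compared across models.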
