Assessing Heuristic Machine Learning Explanations with Model Counting

Machine Learning (ML) models are widely used in decision-making procedures in finance, medicine, education, and other areas. In these areas, ML outcomes can directly affect humans, e.g., by deciding whether a person should get a loan or be released from prison. Therefore, we cannot blindly rely on black-box ML models and need to explain the decisions they make. This has motivated the development of a variety of ML-explainer systems, including LIME and its successor Anchor. Because the explanations produced by existing tools are heuristic, they need to be validated. We propose a SAT-based method to assess the quality of explanations produced by Anchor. We encode a trained ML model and an explanation for a given prediction as a propositional formula. Then, using a state-of-the-art approximate model counter, we estimate the quality of the provided explanation as the number of solutions supporting it.
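
To make the counting idea concrete, here is a minimal sketch in Python. It is not the encoding or the tool used in this work: the classifier `toy_model`, the helper `count_support`, and the dictionary representation of an explanation are all hypothetical, and exhaustive enumeration stands in for the approximate model counter, which is only feasible because the toy input space has four binary features. The sketch counts how many inputs consistent with an explanation receive the explained prediction.

```python
from itertools import product


def toy_model(x):
    # Hypothetical classifier over 4 binary features (an assumption for
    # illustration): predicts 1 when feature 0 is set and at least one of
    # features 1 and 2 is set. Feature 3 is irrelevant.
    return int(x[0] and (x[1] or x[2]))


def count_support(model, n_features, explanation, target):
    """Count inputs covered by `explanation` and those also predicted `target`.

    `explanation` maps a feature index to the value it must take, mimicking
    an anchor-style rule. Exhaustive enumeration replaces the approximate
    model counter here; it only scales to tiny input spaces.
    """
    covered = 0
    supporting = 0
    for x in product((0, 1), repeat=n_features):
        if all(x[i] == v for i, v in explanation.items()):
            covered += 1
            if model(x) == target:
                supporting += 1
    return supporting, covered


# A weak explanation: "feature 0 is set" alone does not fix the prediction.
weak = count_support(toy_model, 4, {0: 1}, target=1)

# A stronger explanation: "features 0 and 1 are set" always yields class 1.
strong = count_support(toy_model, 4, {0: 1, 1: 1}, target=1)

for name, (supporting, covered) in (("weak", weak), ("strong", strong)):
    print(f"{name}: {supporting}/{covered} covered inputs support the explanation")
```

The ratio of supporting to covered inputs plays the role of the quality estimate: the weak rule is supported by only part of the inputs it covers, while the stronger rule is supported by all of them. For realistic models, the two counts would have to come from a model counter applied to a propositional encoding rather than from enumeration.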
