Explaining Deep Learning Models - A Bayesian Non-parametric Approach

Understanding and interpreting how machine learning (ML) models make decisions has been a long-standing challenge. While recent research has proposed various technical approaches that offer clues about how an ML model arrives at individual predictions, these approaches do not let users inspect a model as a complete entity. In this work, we propose a novel technical approach that augments a Bayesian non-parametric regression mixture model with multiple elastic nets. Using the enhanced mixture model, we can extract generalizable insights about a target model through a global approximation. To demonstrate the utility of our approach, we evaluate it on different ML models in the context of image recognition. The empirical results indicate that our approach not only outperforms state-of-the-art techniques in explaining individual decisions but also enables users to discover vulnerabilities in the target ML models.
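To make the surrogate-modeling pattern in the abstract concrete, the sketch below queries a target model on a set of inputs, partitions the inputs, and fits one sparse elastic-net regression per partition, so that each component's coefficients serve as a region-specific explanation. This is a minimal, hypothetical illustration only: KMeans stands in for the paper's Bayesian non-parametric (Dirichlet process) mixture, and the names black_box, fit_mixture_surrogate, and all parameter values are assumptions for illustration, not the authors' implementation.

    # Minimal sketch of a global surrogate built from a mixture of
    # elastic-net regressions. KMeans is a simplified stand-in for the
    # paper's Bayesian non-parametric (Dirichlet process) mixture;
    # `black_box` is assumed to be any fitted binary classifier that
    # exposes predict_proba.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import ElasticNet

    def fit_mixture_surrogate(black_box, X, n_components=5,
                              alpha=0.1, l1_ratio=0.5):
        """Approximate `black_box` with per-cluster sparse linear models.

        Returns the clustering and one ElasticNet per component; each
        component's coefficients act as a feature-importance map for
        the region of input space it covers.
        """
        y = black_box.predict_proba(X)[:, 1]      # target to mimic
        km = KMeans(n_clusters=n_components, n_init=10).fit(X)
        experts = []
        for k in range(n_components):
            mask = km.labels_ == k                # instances in region k
            net = ElasticNet(alpha=alpha, l1_ratio=l1_ratio)
            net.fit(X[mask], y[mask])             # local sparse fit
            experts.append(net)
        return km, experts

    def surrogate_predict(km, experts, X):
        """Route each instance to its cluster's elastic net."""
        labels = km.predict(X)
        return np.array([experts[k].predict(x[None, :])[0]
                         for k, x in zip(labels, X)])

Note one deliberate simplification: this sketch fixes n_components in advance, whereas in the paper's Bayesian non-parametric formulation the Dirichlet process prior lets the data determine the number of mixture components.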
