i-Algebra: Towards Interactive Interpretability of Deep Neural Networks

Providing explanations for deep neural networks (DNNs) is essential for their use in domains wherein the interpretability of decisions is a critical prerequisite. Despite the plethora of work on interpreting DNNs, most existing solutions offer interpretability in an ad hoc, one-shot, and static manner, without accounting for the perception, understanding, or response of end-users, which results in poor usability in practice. In this paper, we argue that DNN interpretability should be implemented as interactions between users and models. We present i-Algebra, a first-of-its-kind interactive framework for interpreting DNNs. At its core is a library of atomic, composable operators that explain model behaviors at varying input granularities, during different inference stages, and from distinct interpretation perspectives. Leveraging a declarative query language, users can flexibly compose such operators to build a variety of analysis tools (e.g., "drill-down", "comparative", and "what-if" analysis). We prototype i-Algebra and conduct user studies on a set of representative analysis tasks, including inspecting adversarial inputs, resolving model inconsistency, and cleansing contaminated data, all of which demonstrate its promising usability.
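To make the operator-composition idea concrete, below is a minimal Python sketch of how atomic interpretation operators might be chained into a "drill-down" query. All names here (Operator, Attribution, Projection, DummyModel, and the attribute hook) are illustrative assumptions for this sketch, not the actual i-Algebra API.

import numpy as np

class DummyModel:
    # Stand-in model for illustration only: attribution = input magnitude.
    def attribute(self, x):
        return np.abs(x)

class Operator:
    # Atomic interpretation operator; operators compose via then().
    def apply(self, value, model):
        raise NotImplementedError
    def then(self, other):
        # Return a new operator that applies `self` first, then `other`.
        outer = self
        class _Composed(Operator):
            def apply(self, value, model):
                return other.apply(outer.apply(value, model), model)
        return _Composed()

class Attribution(Operator):
    # Coarse-grained attribution over the whole input.
    def apply(self, x, model):
        return model.attribute(x)

class Projection(Operator):
    # Restrict the explanation to a user-selected input region.
    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
    def apply(self, attribution, model):
        mask = np.zeros_like(attribution)
        mask[self.rows, self.cols] = 1.0
        return attribution * mask

# "Drill-down" analysis: attribute the full input, then zoom into a
# 16x16 window the user wants to inspect more closely.
query = Attribution().then(Projection(slice(8, 24), slice(8, 24)))
explanation = query.apply(np.random.randn(32, 32), DummyModel())

The point of the sketch is the design choice rather than the specific operators: because each operator is atomic and composition is associative, a declarative front-end can translate user queries into chains of operators without hard-coding each analysis workflow.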
