Black-Box Model Explained Through an Assessment of Its Interpretable Features

Algorithms are powerful and necessary tools behind a large part of the information we use every day. However, they may introduce new sources of bias, discrimination, and other unfair practices that affect people who are unaware of them. Greater algorithmic transparency is indispensable for providing more credible and reliable services. Moreover, requiring developers to design transparent algorithm-driven applications keeps models accessible and understandable to humans, increasing the trust of end users. In this paper we present EBAnO, a new engine able to produce prediction-local explanations for a black-box model by exploiting interpretable feature perturbations. EBAnO combines the hypercolumn representation with cluster analysis to identify a set of interpretable features in images. Furthermore, two indices are proposed to measure the influence of input features on the final prediction made by a CNN model. EBAnO has been preliminarily tested on a set of heterogeneous images. The results highlight the effectiveness of EBAnO in explaining CNN classifications through the evaluation of the influence of interpretable features.
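The abstract outlines a three-step pipeline: build per-pixel hypercolumns from a CNN's intermediate activations, cluster them into interpretable image regions, then perturb each region and quantify its influence on the prediction. The sketch below illustrates that flow under several assumptions that are not stated in the abstract: a pre-trained VGG16 classifier, the block3_conv3/block4_conv3/block5_conv3 layers as the hypercolumn source, four K-means clusters, a mean-colour perturbation, and a simple probability-drop score standing in for the two influence indices that EBAnO actually defines.

import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from sklearn.cluster import KMeans

IMG_SIZE = 224                 # VGG16 input resolution
N_FEATURES = 4                 # number of interpretable regions (assumed)
LAYERS = ["block3_conv3", "block4_conv3", "block5_conv3"]  # hypercolumn layers (assumed)

model = VGG16(weights="imagenet")
extractor = tf.keras.Model(
    inputs=model.input,
    outputs=[model.get_layer(name).output for name in LAYERS],
)

def hypercolumns(x):
    # Upsample the selected activation maps to input resolution and stack them,
    # giving one "hypercolumn" feature vector per pixel.
    maps = extractor(x)
    upsampled = [tf.image.resize(m, (IMG_SIZE, IMG_SIZE)) for m in maps]
    stacked = tf.concat(upsampled, axis=-1)
    return stacked.numpy().reshape(IMG_SIZE * IMG_SIZE, -1)

def interpretable_features(x):
    # Cluster pixel hypercolumns into N_FEATURES candidate interpretable regions.
    labels = KMeans(n_clusters=N_FEATURES, n_init=10).fit_predict(hypercolumns(x))
    return labels.reshape(IMG_SIZE, IMG_SIZE)

def influence_scores(image):
    # Perturb each region (here: replace it with the image mean colour, an
    # illustrative choice) and record the drop in the predicted class probability.
    x = preprocess_input(image[np.newaxis].astype("float32"))
    base = model(x).numpy()[0]
    cls = int(base.argmax())
    regions = interpretable_features(x)
    scores = {}
    for k in range(N_FEATURES):
        perturbed = image.astype("float32")
        perturbed[regions == k] = image.mean(axis=(0, 1))
        xp = preprocess_input(perturbed[np.newaxis])
        scores[k] = float(base[cls] - model(xp).numpy()[0, cls])
    return cls, scores

As a usage example, calling influence_scores on a 224x224 RGB image (as a NumPy array) returns the predicted ImageNet class index and one score per region; a larger probability drop suggests that region contributes more strongly to the classification, which is the intuition behind perturbation-based influence measures.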
