Efficient Saliency Maps for Explainable AI

We describe an explainable AI saliency map method for use with deep convolutional neural networks (CNNs) that is much more efficient than popular fine-resolution gradient methods, while being quantitatively similar to or better than them in accuracy. Our technique works by measuring information at the end of each network scale and then combining these measurements into a single saliency map. We describe how such saliency measures can be made more efficient by exploiting Saliency Map Order Equivalence, and we visualize individual scale/layer contributions using a Layer Ordered Visualization of Information. This provides a comparison of scale information contributions within the network that other saliency map methods do not offer. When used in place of Guided Backprop, our method allows coarse-resolution class activation methods such as Grad-CAM and Grad-CAM++ to yield demonstrably superior results without sacrificing speed. This makes fine-resolution saliency methods feasible on resource-limited platforms such as robots, cell phones, and low-cost industrial devices, as well as in domains such as astronomy and satellite imagery.
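The abstract does not include an implementation, but the overall pipeline it describes can be sketched compactly: run a single forward pass, measure a cheap per-location statistic at the end of each network scale, upsample the per-scale maps, and fuse them into one saliency map. The sketch below is a minimal illustration of that pipeline under stated assumptions; the torchvision ResNet-18 backbone, the `combined_saliency` helper, the use of channel-wise standard deviation as the information measure, and the uniform averaging of scales are illustrative choices, not the paper's exact statistic or its Saliency Map Order Equivalence construction.

```python
# Minimal sketch: per-scale "information" maps fused into one saliency map.
# Assumptions: ResNet-18 from torchvision, one "scale" per residual stage,
# channel-wise std as a stand-in information measure, uniform scale averaging.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

# Capture activations at the end of each network scale with forward hooks.
activations = {}
def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

STAGES = ["layer1", "layer2", "layer3", "layer4"]
for name in STAGES:
    getattr(model, name).register_forward_hook(save_activation(name))

def combined_saliency(image_batch, out_size=(224, 224)):
    """Forward pass only (no gradients), then fuse the per-scale maps."""
    with torch.no_grad():
        model(image_batch)

    per_scale = []
    for name in STAGES:
        act = activations[name]                # (N, C, H, W)
        smap = act.std(dim=1, keepdim=True)    # crude per-location information proxy
        # Normalize each map to [0, 1] so scales are comparable before fusing.
        flat = smap.flatten(2)
        lo = flat.min(dim=2, keepdim=True).values.unsqueeze(-1)
        hi = flat.max(dim=2, keepdim=True).values.unsqueeze(-1)
        smap = (smap - lo) / (hi - lo + 1e-8)
        per_scale.append(F.interpolate(smap, size=out_size,
                                       mode="bilinear", align_corners=False))

    # Uniform average across scales; the paper instead weights and visualizes
    # the individual layer contributions (its Layer Ordered Visualization of
    # Information), which this sketch does not reproduce.
    return torch.stack(per_scale, dim=0).mean(dim=0)   # (N, 1, 224, 224)

# Usage: saliency = combined_saliency(preprocessed_images)
```

Because the map is built from a single forward pass over already-computed activations, its cost is negligible next to gradient-based fine-resolution methods, which is the efficiency argument the abstract makes.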
