Revisiting the Importance of Individual Units in CNNs via Ablation

We revisit the importance of individual units in Convolutional Neural Networks (CNNs) for visual recognition. Through unit-ablation experiments on CNNs trained on large-scale image datasets, we demonstrate that although ablating any single unit barely affects overall classification accuracy, it can significantly damage the accuracy of specific classes. This result shows that individual units specialize in encoding information relevant to a subset of classes. We compute the correlation between the accuracy drop under unit ablation and various attributes of the ablated unit, such as its class selectivity and the L1 norm of its weights. Consistent with recent work \cite{morcos2018importance}, we confirm that attributes such as class selectivity are poor predictors of a unit's impact on overall accuracy. However, our results show that class selectivity, together with other attributes, is a good predictor of a unit's importance to individual classes. We further evaluate how random rotation, batch normalization, and dropout affect the importance of units to specific classes. Our results show that highly selective units play an important role in the network's classification power at the individual-class level, making it necessary and meaningful to understand and interpret their behavior.
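
For concreteness, the class selectivity index of \cite{morcos2018importance}, which we take to be the definition intended above (the abstract itself does not spell it out), is

\[ \text{selectivity} = \frac{\mu_{\max} - \mu_{-\max}}{\mu_{\max} + \mu_{-\max}}, \]

where $\mu_{\max}$ is the unit's class-conditional mean activity over its most-activating class and $\mu_{-\max}$ is its mean activity over all remaining classes; the index ranges from 0 (identical activity across classes) to 1 (active for only one class).

The ablation protocol itself can be sketched in a few lines. The PyTorch snippet below zeroes a single convolutional channel at inference time via a forward hook and returns per-class accuracy; the model, layer, unit index, and data loader are placeholders supplied by the caller, so this is a minimal sketch of the procedure described above, not the authors' released code.

import torch

def per_class_accuracy_with_ablation(model, layer, unit_idx, loader,
                                     num_classes, device="cpu"):
    """Zero out channel `unit_idx` of `layer` at inference time and
    return per-class accuracy over `loader`."""
    def zero_unit(module, inputs, output):
        output = output.clone()
        output[:, unit_idx] = 0.0  # ablate one unit (feature-map channel)
        return output              # returned tensor replaces the layer output

    handle = layer.register_forward_hook(zero_unit)
    correct = torch.zeros(num_classes)
    total = torch.zeros(num_classes)
    model.eval()
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1).cpu()
            for c in labels.unique():
                mask = labels == c
                total[c] += mask.sum()
                correct[c] += (preds[mask] == c).sum()
    handle.remove()                          # restore the unablated network
    return correct / total.clamp(min=1)      # guard against absent classes

Subtracting these accuracies from an unablated baseline gives the per-class accuracy drop, which can then be correlated with unit attributes such as selectivity and weight L1 norm.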

[1] Jitendra Malik, et al. Region-Based Convolutional Networks for Accurate Object Detection and Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016.

[2] Bolei Zhou, et al. Network Dissection: Quantifying Interpretability of Deep Visual Representations. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

[3] Bolei Zhou, et al. Places: A 10 Million Image Database for Scene Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.

[4] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision, 2015.

[5] Song Han, et al. Learning both Weights and Connections for Efficient Neural Network. NIPS, 2015.

[6] Hanan Samet, et al. Pruning Filters for Efficient ConvNets. ICLR, 2017.

[7] Stefan Carlsson, et al. CNN Features Off-the-Shelf: An Astounding Baseline for Recognition. IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2014.

[8] Samy Bengio, et al. Understanding deep learning requires rethinking generalization. ICLR, 2017.

[9] Trevor Darrell, et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.

[10] Deborah Silver, et al. Feature Visualization. Scientific Visualization, 1994.

[11] Bolei Zhou, et al. Object Detectors Emerge in Deep Scene CNNs. ICLR, 2015.

[12] Yoshua Bengio, et al. How transferable are features in deep neural networks? NIPS, 2014.

[13] Joan Bruna, et al. Intriguing properties of neural networks. ICLR, 2014.

[14] Song Han, et al. DSD: Dense-Sparse-Dense Training for Deep Neural Networks. ICLR, 2017.

[15] Thomas Brox, et al. Synthesizing the preferred inputs for neurons in neural networks via deep generator networks. NIPS, 2016.

[16] Jian Sun, et al. Deep Residual Learning for Image Recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

[17] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks. NIPS, 2012.

[18] Rob Fergus, et al. Visualizing and Understanding Convolutional Networks. ECCV, 2014.

[19] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 2014.

[20] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. ICML, 2015.

[21] Hod Lipson, et al. Convergent Learning: Do different neural networks learn the same representations? FE@NIPS, 2015.

[22] Andrew Zisserman, et al. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. ICLR Workshop, 2014.

[23] Matthew Botvinick, et al. On the importance of single directions for generalization. ICLR, 2018.