Is Robustness To Transformations Driven by Invariant Neural Representations?

Deep Convolutional Neural Networks (DCNNs) have demonstrated impressive robustness when recognizing objects under transformations (e.g. blur or noise), provided these transformations are included in the training set. One hypothesis to explain such robustness is that DCNNs develop invariant neural representations that remain unaltered when the image is transformed. Yet, to what extent this hypothesis holds true is an outstanding question, as including transformations in the training set could lead to properties other than invariance; for example, parts of the network could become specialized to recognize either transformed or non-transformed images. In this paper, we analyze the conditions under which invariance emerges. To do so, we leverage the fact that invariant representations facilitate robustness to transformations for object categories that are not seen transformed during training. Our results with state-of-the-art DCNNs indicate that invariant representations strengthen as the number of transformed categories in the training set increases. This effect is much more prominent for local transformations such as blurring and high-pass filtering than for geometric transformations such as rotation and thinning, which entail changes in the spatial arrangement of the object. Our results contribute to a better understanding of invariant representations in deep learning and of the conditions under which invariance spontaneously emerges.
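The protocol described above can be sketched in code. The following is a minimal, illustrative sketch, not the paper's actual pipeline: it shows (1) how a training set can be built in which only a chosen subset of categories is seen transformed (Gaussian blur standing in for the local transformations), and (2) a simple invariance score computed as the cosine similarity between a network's penultimate-layer activations for clean and transformed versions of held-out-category images. The model choice (torchvision ResNet-18), the blur parameters, and the similarity measure are all assumptions made for illustration.

```python
# Minimal sketch (assumptions: torchvision ResNet-18, Gaussian blur as the
# "local" transformation, cosine similarity as the invariance score).
import torch
import torch.nn.functional as F
from torchvision.models import resnet18
from torchvision.transforms.functional import gaussian_blur


def build_training_views(images, labels, transformed_categories):
    """Return clean images plus blurred copies of the chosen categories only.

    `transformed_categories` is the set of class ids that are seen transformed
    during training; all other classes appear only in their clean form.
    """
    views, targets = [images], [labels]
    mask = torch.tensor([int(l) in transformed_categories for l in labels])
    if mask.any():
        blurred = gaussian_blur(images[mask], kernel_size=9, sigma=3.0)
        views.append(blurred)
        targets.append(labels[mask])
    return torch.cat(views), torch.cat(targets)


@torch.no_grad()
def invariance_score(model, images):
    """Mean cosine similarity between penultimate features of clean and blurred images."""
    features = torch.nn.Sequential(*list(model.children())[:-1])  # drop the classifier head
    clean = features(images).flatten(1)
    blurred = features(gaussian_blur(images, kernel_size=9, sigma=3.0)).flatten(1)
    return F.cosine_similarity(clean, blurred, dim=1).mean().item()


if __name__ == "__main__":
    model = resnet18(weights=None).eval()      # untrained network, for illustration only
    images = torch.rand(8, 3, 224, 224)        # stand-in for held-out-category images
    labels = torch.randint(0, 10, (8,))
    train_x, train_y = build_training_views(images, labels, transformed_categories={0, 1, 2})
    print(f"training views: {train_x.shape[0]}, invariance: {invariance_score(model, images):.3f}")
```

In an actual experiment, `transformed_categories` would be varied in size to test whether the invariance score on categories outside that set grows as more categories are seen transformed during training.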
