Measuring the tendency of CNNs to Learn Surface Statistical Regularities

Deep CNNs are known to exhibit the following peculiarity: on the one hand they generalize extremely well to a test set, while on the other hand they are extremely sensitive to so-called adversarial perturbations. The extreme sensitivity of high-performance CNNs to adversarial examples casts serious doubt on whether these networks are learning high-level abstractions from the dataset. We are concerned with the following question: how can a deep CNN that does not learn any high-level semantics of the dataset manage to generalize so well? The goal of this article is to measure the tendency of CNNs to learn surface statistical regularities of the dataset. To this end, we use Fourier filtering to construct datasets that share exactly the same high-level abstractions but exhibit qualitatively different surface statistical regularities. For the SVHN and CIFAR-10 datasets, we present two Fourier-filtered variants: a low-frequency variant and a randomly filtered variant. Each filtering scheme is tuned to preserve the recognizability of the objects. Our main finding is that CNNs tend to latch onto the Fourier image statistics of the training dataset, exhibiting generalization gaps of up to 28% across the various test sets. Moreover, significantly increasing the depth of a network has only a marginal impact on closing this gap. We thus provide quantitative evidence supporting the hypothesis that deep CNNs tend to learn surface statistical regularities in the dataset rather than higher-level abstract concepts.
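
As a rough illustration of how a low-frequency variant of a dataset can be constructed, the sketch below applies a radial low-pass mask in the Fourier domain, per colour channel. The mask shape and the cutoff radius are assumptions chosen for illustration, not the exact filtering parameters used in the paper; the randomly filtered variant would use a different (randomized) mask in the same pipeline.

```python
import numpy as np

def low_pass_filter(image, cutoff_radius):
    """Zero out Fourier coefficients beyond a radial cutoff, per channel.

    Minimal sketch of low-frequency Fourier filtering; the paper's exact
    mask and cutoff may differ.
    """
    h, w = image.shape[:2]
    # Distance of each frequency from the DC component (centred spectrum).
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    mask = dist <= cutoff_radius

    filtered = np.empty_like(image, dtype=np.float64)
    for c in range(image.shape[2]):
        spectrum = np.fft.fftshift(np.fft.fft2(image[:, :, c]))
        spectrum *= mask  # keep only the low-frequency coefficients
        filtered[:, :, c] = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))
    return np.clip(filtered, 0, 255).astype(image.dtype)

# Example: build a low-frequency variant of a 32x32 RGB image (e.g. CIFAR-10).
image = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
low_freq = low_pass_filter(image, cutoff_radius=10)
```

In this setup, the cutoff radius plays the role of the tuning knob mentioned above: it is chosen large enough that the filtered images remain recognizable while the high-frequency surface statistics change.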
