Learned Deformation Stability in Convolutional Neural Networks

Conventional wisdom holds that interleaved pooling layers in convolutional neural networks confer stability to small translations and deformations. In this work, we investigate this claim empirically. We find that while pooling does confer deformation stability at initialization, the deformation stability of each layer changes significantly over the course of training and even decreases in some layers, suggesting that deformation stability is not unilaterally helpful. Surprisingly, after training, the pattern of deformation stability across layers is largely independent of whether or not pooling was present. We then show that a significant factor in determining deformation stability is filter smoothness. Moreover, filter smoothness and deformation stability are not simply a consequence of the distribution of input images, but depend crucially on the joint distribution of images and labels. This work demonstrates that biases such as deformation stability can in fact be learned, and provides an example of how a simple property of learned network weights can illuminate the overall network computation.
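
To make the two central quantities concrete, the sketch below shows one plausible way to estimate them. The paper does not specify an implementation, so the library choice (PyTorch), the smoothed random displacement field used as the deformation, and all function and parameter names (smooth_deformation_grid, deformation_sensitivity, filter_smoothness, magnitude, blur) are assumptions made here for illustration only.

```python
# A minimal sketch (not the authors' protocol): estimate a layer's sensitivity
# to small, smooth deformations and the smoothness of its convolutional filters.
import torch
import torch.nn.functional as F


def smooth_deformation_grid(batch, height, width, magnitude=0.02, blur=7):
    """Sampling grid for grid_sample: identity plus a smoothed random displacement."""
    flow = torch.randn(batch, 2, height, width) * magnitude
    # Blurring the displacement field makes nearby pixels move together,
    # i.e. a small, smooth deformation rather than per-pixel noise.
    flow = F.avg_pool2d(flow, kernel_size=blur, stride=1, padding=blur // 2)
    ys, xs = torch.meshgrid(
        torch.linspace(-1.0, 1.0, height),
        torch.linspace(-1.0, 1.0, width),
        indexing="ij",
    )
    base = torch.stack([xs, ys], dim=-1).expand(batch, -1, -1, -1)  # (B, H, W, 2)
    return base + flow.permute(0, 2, 3, 1)


def deformation_sensitivity(layer_fn, images):
    """Normalized distance between a layer's activations on original and deformed images."""
    b, _, h, w = images.shape
    grid = smooth_deformation_grid(b, h, w)
    deformed = F.grid_sample(images, grid, align_corners=False)
    act, act_def = layer_fn(images), layer_fn(deformed)
    num = (act - act_def).flatten(1).norm(dim=1)
    den = act.flatten(1).norm(dim=1).clamp_min(1e-8)
    return num / den  # one value per image; lower means more deformation-stable


def filter_smoothness(conv_weight):
    """Mean squared difference between adjacent filter taps (lower = smoother filters)."""
    dh = conv_weight[..., 1:, :] - conv_weight[..., :-1, :]
    dw = conv_weight[..., :, 1:] - conv_weight[..., :, :-1]
    return 0.5 * (dh.pow(2).mean() + dw.pow(2).mean())
```

Averaging deformation_sensitivity over a batch of images, layer by layer, at initialization and again after training, for architectures with and without pooling, would give the kind of layer-wise comparison the abstract describes; filter_smoothness applied to each layer's convolution weights would give the corresponding smoothness measure.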
