On Lipschitz Bounds of General Convolutional Neural Networks

Many convolutional neural networks (CNNs) have a feed-forward structure. In this paper, we introduce a general framework for analyzing the Lipschitz bounds of CNNs and propose a linear program that estimates these bounds. Several CNNs, including scattering networks, AlexNet, and GoogLeNet, are studied numerically. In these numerical examples, empirical estimates of the local Lipschitz bounds are compared with the theoretical bounds. Based on the Lipschitz bounds, we then establish concentration inequalities for the output distribution with respect to a stationary random input signal. The Lipschitz bound is further used to perform a nonlinear discriminant analysis that measures the separation between features of different classes.
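To make the comparison between a theoretical bound and an empirical local estimate concrete, the following is a minimal sketch on a toy network. It is not the linear program proposed in the paper. It assumes a hypothetical chain of 1-D convolution layers followed by ReLU, and uses the cruder bound obtained from Young's inequality, $\|x * h\|_2 \le \|h\|_1 \|x\|_2$, together with the fact that ReLU is 1-Lipschitz. All function names, filters, and parameters are illustrative assumptions.

```python
import math
import random


def conv1d_full(x, h):
    """Full (zero-padded) 1-D convolution of signal x with filter h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def relu(v):
    """Pointwise ReLU, a 1-Lipschitz nonlinearity."""
    return [max(0.0, t) for t in v]


def network(x, filters):
    """Toy feed-forward chain: convolution followed by ReLU, per layer."""
    for h in filters:
        x = relu(conv1d_full(x, h))
    return x


def l2(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(t * t for t in v))


def upper_bound(filters):
    """Global Lipschitz upper bound: product of the filters' l1 norms.

    Follows from Young's inequality per layer; this is an assumption-level
    stand-in for the paper's bound, not the paper's method.
    """
    bound = 1.0
    for h in filters:
        bound *= sum(abs(t) for t in h)
    return bound


def local_estimate(x, filters, trials=200, radius=1e-3, seed=0):
    """Empirical local Lipschitz estimate around x.

    Takes the largest ratio ||f(x + d) - f(x)|| / ||d|| over random
    perturbations d of norm `radius`. This is a lower estimate of the
    true local Lipschitz constant.
    """
    rng = random.Random(seed)
    fx = network(x, filters)
    best = 0.0
    for _ in range(trials):
        d = [rng.gauss(0.0, 1.0) for _ in x]
        scale = radius / l2(d)
        d = [scale * t for t in d]
        fxd = network([a + b for a, b in zip(x, d)], filters)
        ratio = l2([a - b for a, b in zip(fxd, fx)]) / l2(d)
        best = max(best, ratio)
    return best


# Hypothetical two-layer example with hand-picked filters.
filters = [[0.5, -0.25, 0.25], [1.0, -1.0]]
signal_rng = random.Random(1)
x = [signal_rng.gauss(0.0, 1.0) for _ in range(32)]

print("upper bound:   ", upper_bound(filters))
print("local estimate:", local_estimate(x, filters))
```

The empirical value can never exceed the upper bound, and the gap between the two gives a rough sense of how conservative such a product-type bound is at a given input.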
