论文信息 - PAC-BAYESIAN MARGIN BOUNDS FOR CONVOLUTIONAL NEURAL NETWORKS

PAC-BAYESIAN MARGIN BOUNDS FOR CONVOLUTIONAL NEURAL NETWORKS

Recently the generalisation error of deep neural networks has been analysed through the PAC-Bayesian framework, for the case of fully connected layers. We adapt this approach to the convolutional setting.

[1] Pierre Vandergheynst,et al. FeTa: A DCA Pruning Algorithm with Generalization Error Guarantees , 2018, ArXiv.

[2] David A. McAllester,et al. A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks , 2017, ICLR.

[3] Joel A. Tropp,et al. User-Friendly Tail Bounds for Sums of Random Matrices , 2010, Found. Comput. Math..

[4] Peter L. Bartlett,et al. Nearly-tight VC-dimension and Pseudodimension Bounds for Piecewise Linear Neural Networks , 2017, J. Mach. Learn. Res..

[5] A. Bandeira,et al. Sharp nonasymptotic bounds on the norm of random matrices with independent entries , 2014, 1408.6185.

[6] Song Han,et al. Learning both Weights and Connections for Efficient Neural Network , 2015, NIPS.

[7] Matus Telgarsky,et al. Spectrally-normalized margin bounds for neural networks , 2017, NIPS.

[8] Ruslan Salakhutdinov,et al. Geometry of Optimization and Implicit Regularization in Deep Learning , 2017, ArXiv.

[9] Xin Dong,et al. Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon , 2017, NIPS.

[10] Jorge Nocedal,et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima , 2016, ICLR.

[11] Guillermo Sapiro,et al. Robust Large Margin Deep Neural Networks , 2017, IEEE Transactions on Signal Processing.

[12] Christoph H. Lampert,et al. Data-Dependent Stability of Stochastic Gradient Descent , 2017, ICML.

[13] Abbas Mehrabian,et al. Nearly-tight VC-dimension bounds for piecewise linear neural networks , 2017, COLT.

[14] Razvan Pascanu,et al. Sharp Minima Can Generalize For Deep Nets , 2017, ICML.

[15] David A. McAllester. Some PAC-Bayesian Theorems , 1998, COLT' 98.

[16] Yi Zhang,et al. Stronger generalization bounds for deep nets via a compression approach , 2018, ICML.

[17] J. Rissanen. A UNIVERSAL PRIOR FOR INTEGERS AND ESTIMATION BY MINIMUM DESCRIPTION LENGTH , 1983 .

[18] John Shawe-Taylor,et al. A PAC analysis of a Bayesian estimator , 1997, COLT '97.

[19] Nathan Srebro,et al. The Implicit Bias of Gradient Descent on Separable Data , 2017, J. Mach. Learn. Res..

[20] Samy Bengio,et al. Understanding deep learning requires rethinking generalization , 2016, ICLR.

[21] Roman Vershynin,et al. Introduction to the non-asymptotic analysis of random matrices , 2010, Compressed Sensing.