Wide stochastic networks: Gaussian limit and PAC-Bayesian training

The infinite-width limit substantially simplifies the analytical study of overparameterized neural networks: with a suitable random initialization, a sufficiently wide network is well approximated by a Gaussian process, both before and during training. In the present work, we establish an analogous result for a simple stochastic architecture whose parameters are random variables. Because the output distribution can be evaluated explicitly, it admits a PAC-Bayesian training procedure that directly optimizes the generalization bound. For a large but finite-width network, we show empirically on MNIST that this training approach can outperform standard PAC-Bayesian methods.
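
For reference, a standard bound of the kind such bound-minimization procedures start from is the PAC-Bayes-kl inequality of Maurer (2004); the specific bound and relaxation optimized in this work are not reproduced here, so the following is only a generic sketch with standard notation. For an i.i.d. sample S of size n >= 8, a prior P over parameters fixed before seeing S, and any delta in (0,1), with probability at least 1 - delta, simultaneously for all posteriors Q,

    \mathrm{kl}\big( \hat{L}_S(Q) \,\big\|\, L(Q) \big) \;\le\; \frac{ \mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta} }{ n },

where \hat{L}_S(Q) and L(Q) denote the expected empirical and true risks under Q, and \mathrm{kl}(q \,\|\, p) = q \ln\frac{q}{p} + (1-q) \ln\frac{1-q}{1-p}. Pinsker's inequality \mathrm{kl}(q \,\|\, p) \ge 2 (p - q)^2 gives the differentiable relaxation

    L(Q) \;\le\; \hat{L}_S(Q) + \sqrt{ \frac{ \mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta} }{ 2n } },

whose right-hand side can be minimized directly over the posterior Q. In the setting described above, the explicit (Gaussian, in the wide limit) form of the output distribution is what makes such an objective tractable to evaluate.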
