Kenneth E. Barner | Xin Guo | Xinjie Lan