Jian Li | Mingda Qiao | Xuanyuan Luo
[1] Gintare Karolina Dziugaite, et al. Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data, 2017, UAI.
[2] Stefano Soatto, et al. Entropy-SGD: biasing gradient descent into wide valleys, 2016, ICLR.
[3] Shiliang Sun, et al. PAC-Bayes bounds for stable algorithms with instance-dependent priors, 2018, NeurIPS.
[4] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[5] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[6] G. Menz, et al. Poincaré and logarithmic Sobolev inequalities by decomposition of the energy landscape, 2012, arXiv:1202.1510.
[7] Yuchen Zhang, et al. A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics, 2017, COLT.
[8] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.
[9] Jinghui Chen, et al. Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization, 2017, NeurIPS.
[10] M. Ledoux, et al. Analysis and Geometry of Markov Diffusion Operators, 2013.
[11] Yee Whye Teh, et al. Bayesian Learning via Stochastic Gradient Langevin Dynamics, 2011, ICML.
[12] G. Pavliotis. Stochastic Processes and Applications: Diffusion Processes, the Fokker-Planck and Langevin Equations, 2014.
[13] Ruosong Wang, et al. On Exact Computation with an Infinitely Wide Neural Net, 2019, NeurIPS.
[14] Matus Telgarsky, et al. Spectrally-normalized margin bounds for neural networks, 2017, NIPS.
[15] Maxim Raginsky, et al. Local Optimality and Generalization Guarantees for the Langevin Algorithm via Empirical Metastability, 2018, COLT.
[16] Colin Wei, et al. Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel, 2018, NeurIPS.
[17] Tomaso A. Poggio, et al. Fisher-Rao Metric, Geometry, and Complexity of Neural Networks, 2017, AISTATS.
[18] A. Bovier, et al. Metastability in reversible diffusion processes II. Precise asymptotics for small eigenvalues, 2005.
[19] S. Sharma, et al. The Fokker-Planck Equation, 2010.
[20] Matus Telgarsky, et al. Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis, 2017, COLT.
[21] A. Bovier. Metastability: A Potential-Theoretic Approach, 2016.
[22] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[23] Gintare Karolina Dziugaite, et al. Entropy-SGD optimizes the prior of a PAC-Bayes bound: Generalization properties of Entropy-SGD and data-dependent priors, 2017, ICML.
[24] Ben London, et al. A PAC-Bayesian Analysis of Randomized Learning with Application to Stochastic Gradient Descent, 2017, NIPS.
[25] Bin Yu, et al. Stability and Convergence Trade-off of Iterative Optimization Algorithms, 2018, arXiv.
[26] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2), 1983.
[27] Ryota Tomioka, et al. Norm-Based Capacity Control in Neural Networks, 2015, COLT.
[28] Yi Zhang, et al. Stronger generalization bounds for deep nets via a compression approach, 2018, ICML.
[29] John Shawe-Taylor, et al. Tighter PAC-Bayes bounds through distribution-dependent priors, 2013, Theor. Comput. Sci.
[30] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[31] Yuan Cao, et al. Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks, 2018, arXiv.
[32] André Elisseeff, et al. Stability and Generalization, 2002, J. Mach. Learn. Res.
[33] David A. McAllester, et al. A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks, 2017, ICLR.
[34] Qiang Liu, et al. On the Margin Theory of Feedforward Neural Networks, 2018, arXiv.
[35] Kai Zheng, et al. Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints, 2017, COLT.
[36] D. Stroock, et al. Logarithmic Sobolev inequalities and stochastic Ising models, 1987.
[37] Massimiliano Pontil, et al. Stability of Randomized Learning Algorithms, 2005, J. Mach. Learn. Res.
[38] Flemming Topsøe, et al. Some inequalities for information divergence and related measures of discrimination, 2000, IEEE Trans. Inf. Theory.
[39] Colin Wei, et al. Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation, 2019, NeurIPS.
[40] Boris Polyak. Some methods of speeding up the convergence of iteration methods, 1964.
[41] Francis Bach, et al. On Lazy Training in Differentiable Programming, 2018, NeurIPS.
[42] Maxim Raginsky, et al. Information-theoretic analysis of generalization capability of learning algorithms, 2017, NIPS.
[43] Varun Jog, et al. Generalization Error Bounds for Noisy, Iterative Algorithms, 2018, IEEE International Symposium on Information Theory (ISIT).
[44] Christoph H. Lampert, et al. Data-Dependent Stability of Stochastic Gradient Descent, 2017, ICML.
[45] Jan Vondrák, et al. High probability generalization bounds for uniformly stable algorithms with nearly optimal rate, 2019, COLT.
[46] Yuanzhi Li, et al. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers, 2018, NeurIPS.
[47] Yoram Singer, et al. Train faster, generalize better: Stability of stochastic gradient descent, 2015, ICML.
[48] Francis Bach, et al. A Note on Lazy Training in Supervised Differentiable Programming, 2018, arXiv.
[49] Ben London. Generalization Bounds for Randomized Learning with Application to Stochastic Gradient Descent, 2016.
[50] Liwei Wang, et al. Gradient Descent Finds Global Minima of Deep Neural Networks, 2018, ICML.