[1] John E. Dennis, et al. Numerical methods for unconstrained optimization and nonlinear equations, 1983, Prentice Hall Series in Computational Mathematics.
[2] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.
[3] Chao Yang, et al. ARPACK users' guide: solution of large-scale eigenvalue problems with implicitly restarted Arnoldi methods, 1998, Software, Environments, Tools.
[4] H. Robbins. A Stochastic Approximation Method, 1951.
[5] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[6] Surya Ganguli, et al. On the saddle point problem for non-convex optimization, 2014, arXiv.
[7] Surya Ganguli, et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, 2014, NIPS.
[8] Surya Ganguli, et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, 2013, ICLR.
[9] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Yann LeCun, et al. Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond, 2016, arXiv:1611.07476.
[11] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[12] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[13] Stefano Soatto, et al. Entropy-SGD: biasing gradient descent into wide valleys, 2016, ICLR.
[14] Yann Dauphin, et al. Empirical Analysis of the Hessian of Over-Parametrized Neural Networks, 2017, ICLR.
[15] Jorge Nocedal, et al. Optimization Methods for Large-Scale Machine Learning, 2016, SIAM Rev.
[16] Sanjeev Arora, et al. On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization, 2018, ICML.
[17] Andrew M. Saxe, et al. High-dimensional dynamics of generalization error in neural networks, 2017, Neural Networks.