EA-CG: An Approximate Second-Order Method for Training Fully-Connected Neural Networks
[2] Nicol N. Schraudolph, et al. Fast Curvature Matrix-Vector Products for Second-Order Gradient Descent, 2002, Neural Computation.
[3] John Wright, et al. Using negative curvature in solving nonlinear programs, 2017, Comput. Optim. Appl.
[4] James Martens, et al. Deep learning via Hessian-free optimization, 2010, ICML.
[5] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[7] Charles V. Stewart, et al. Robust Parameter Estimation in Computer Vision, 1999, SIAM Rev.
[8] Klaus-Robert Müller, et al. Efficient BackProp, 2012, Neural Networks: Tricks of the Trade.
[9] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[10] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[11] Edward Y. Chang, et al. Artificial Intelligence in XPRIZE DeepQ Tricorder, 2017, MMHealth@MM.
[12] David Barber, et al. Practical Gauss-Newton Optimisation for Deep Learning, 2017, ICML.
[13] Roger B. Grosse, et al. A Kronecker-factored approximate Fisher matrix for convolution layers, 2016, ICML.
[14] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.
[15] Richard Socher, et al. Block-diagonal Hessian-free Optimization for Training Neural Networks, 2017, ArXiv.
[16] Nassir Navab, et al. Robust Optimization for Deep Regression, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[17] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[18] Stuart E. Dreyfus, et al. Second-order stagewise backpropagation for Hessian-matrix analyses and investigation of negative curvature, 2008, Neural Networks.
[19] Surya Ganguli, et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, 2014, NIPS.
[20] Boris Polyak, et al. Acceleration of stochastic approximation by averaging, 1992.
[21] Kenji Fukumizu, et al. Adaptive Method of Realizing Natural Gradient Learning for Multilayer Perceptrons, 2000, Neural Computation.
[22] Roger B. Grosse, et al. Optimizing Neural Networks with Kronecker-factored Approximate Curvature, 2015, ICML.