[1] J. Nocedal. Updating Quasi-Newton Matrices With Limited Storage, 1980.

[2] Yann LeCun, et al. Improving the convergence of back-propagation learning with second-order methods, 1989.

[3] Barak A. Pearlmutter, et al. Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors, 1992, NIPS.

[4] A. Genz. Methods for Generating Random Orthogonal Matrices, 2000.

[5] Simon Günter, et al. A Stochastic Quasi-Newton Method for Online Convex Optimization, 2007, AISTATS.

[6] Patrick Gallinari, et al. SGD-QN: Careful Quasi-Newton Stochastic Gradient Descent, 2009, J. Mach. Learn. Res.

[7] Quoc V. Le, et al. On optimization methods for deep learning, 2011, ICML.

[8] O. Chapelle. Improved Preconditioner for Hessian Free Optimization, 2011.

[9] Charles A. Bouman, et al. The Sparse Matrix Transform for Covariance Estimation and Analysis of High Dimensional Signals, 2011, IEEE Transactions on Image Processing.

[10] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.

[11] Klaus-Robert Müller, et al. Efficient BackProp, 2012, Neural Networks.

[12] Ilya Sutskever, et al. Estimating the Hessian by Back-propagating Curvature, 2012, ICML.