Theoretical Analysis of Learning Speed in Gradient Descent Algorithm Replacing Derivative with Constant
[1] Michael Biehl, et al. Learning by on-line gradient descent, 1995.
[2] Masato Okada, et al. Statistical Mechanics of On-line Node-perturbation Learning, 2011.
[3] O. Kinouchi, et al. Optimal generalization in perceptrons, 1992.
[4] M. Rattray, et al. Incorporating curvature information into on-line learning, 1999.
[5] R. Palmer, et al. Introduction to the theory of neural computation, 1994, The Advanced Book Program.
[6] Barak A. Pearlmutter, et al. Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors, 1992, NIPS 1992.
[7] Anders Krogh, et al. Introduction to the theory of neural computation, 1994, The Advanced Book Program.
[8] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.
[9] Saad, et al. On-line learning in soft committee machines, 1995, Physical Review E.
[10] Christopher K. I. Williams. Computation with Infinite Neural Networks, 1998, Neural Computation.
[11] Kenji Fukumizu, et al. A Regularity Condition of the Information Matrix of a Multilayer Perceptron Network, 1996, Neural Networks.