Natural Gradient Descent for Training Multi-Layer Perceptrons
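For orientation, the natural-gradient update that the title refers to takes the standard form below; the notation is the conventional one (with \(\eta\) the learning rate, \(G\) the Fisher information matrix of the network's statistical model, and \(L\) the training loss) rather than anything quoted from the paper itself:

\[ \theta_{t+1} = \theta_t - \eta\, G(\theta_t)^{-1}\, \nabla_\theta L(\theta_t) \]

Ordinary gradient descent is recovered when \(G\) is replaced by the identity matrix.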
[1] G. Stewart. Introduction to matrix computations, 1973.
[2] Shun-ichi Amari, et al. Differential-geometrical methods in statistics, 1985.
[3] M. Kendall, et al. Kendall's advanced theory of statistics, 1995.
[4] Robert A. Jacobs, et al. Increased rates of convergence through learning rate adaptation, 1987, Neural Networks.
[5] John E. Moody, et al. Towards Faster Stochastic Gradient Search, 1991, NIPS.
[6] M. Murray, et al. Differential Geometry and Statistics, 1993.
[7] Shun-ichi Amari, et al. Statistical Theory of Learning Curves under Entropic Loss Criterion, 1993, Neural Computation.
[8] Jean-François Cardoso, et al. Equivariant adaptive source separation, 1996, IEEE Trans. Signal Process.
[9] Shun-ichi Amari, et al. Neural Learning in Structured Parameter Spaces - Natural Riemannian Gradient, 1996, NIPS.
[10] Klaus-Robert Müller, et al. Asymptotic statistical theory of overtraining and cross-validation, 1997, IEEE Trans. Neural Networks.
[11] S. Amari. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.