Efficient BackProp
Klaus-Robert Müller | Yann LeCun | Léon Bottou | Genevieve B. Orr
[1] Shun-ichi Amari, et al. Complexity Issues in Natural Gradient Descent Method for Training Multilayer Perceptrons, 1998, Neural Computation.
[2] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.
[3] Shun-ichi Amari, et al. The Efficiency and the Robustness of Natural Gradient Descent Learning Rule, 1997, NIPS.
[4] Genevieve B. Orr, et al. Removing Noise in On-Line Search using Adaptive Batch Sizes, 1996, NIPS.
[5] Shun-ichi Amari, et al. Neural Learning in Structured Parameter Spaces - Natural Riemannian Gradient, 1996, NIPS.
[6] Andreas Ziehe, et al. Adaptive On-line Learning in Changing Environments, 1996, NIPS.
[7] Saad, et al. Exact Solution for On-line Learning in Multilayer Neural Networks, 1995, Physical Review Letters.
[8] Mark J. L. Orr, et al. Regularization in the Selection of Radial Basis Function Centers, 1995, Neural Computation.
[9] W. Wiegerinck, et al. Stochastic Dynamics of Learning with Momentum in Neural Networks, 1994.
[10] Wray L. Buntine, et al. Computing Second Derivatives in Feed-forward Networks: A Review, 1994, IEEE Trans. Neural Networks.
[11] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.
[12] J. G. Taylor, et al. Mathematical Approaches to Neural Networks, 1993.
[13] Martin Fodslette Møller, et al. Supervised Learning On Large Redundant Training Sets, 1993, Int. J. Neural Syst.
[14] Barak A. Pearlmutter, et al. Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors, 1992, NIPS.
[15] Richard S. Sutton, et al. Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta, 1992, AAAI.
[16] Roberto Battiti, et al. First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method, 1992, Neural Computation.
[17] Elie Bienenstock, et al. Neural Networks and the Bias/Variance Dilemma, 1992, Neural Computation.
[18] Pierre Priouret, et al. Adaptive Algorithms and Stochastic Approximations, 1990, Applications of Mathematics.
[19] M. F. Møller. A Scaled Conjugate Gradient Algorithm for Fast Supervised Learning, 1990.
[20] John E. Moody, et al. Note on Learning Rate Schedules for Stochastic Optimization, 1990, NIPS.
[21] John Moody, et al. Fast Learning in Networks of Locally-Tuned Processing Units, 1989, Neural Computation.
[22] Geoffrey E. Hinton, et al. Phoneme Recognition Using Time-Delay Neural Networks, 1989, IEEE Trans. Acoust. Speech Signal Process.
[23] R. Fletcher. Practical Methods of Optimization, 1988.
[24] R. Jacobs. Increased Rates of Convergence Through Learning Rate Adaptation, 1987, Neural Networks.
[25] Yann LeCun. PhD thesis: Modèles connexionnistes de l'apprentissage (connectionist learning models), 1987.
[26] Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.
[27] Vladimir Cherkassky, et al. Statistical Learning Theory, 1998.
[28] Christopher M. Bishop, et al. Neural Networks for Pattern Recognition, 1995.
[29] Haim Sompolinsky, et al. On-line Learning of Dichotomies: Algorithms and Learning Curves, 1995, NIPS.
[30] Patrick van der Smagt. Minimisation Methods for Training Feedforward Neural Networks, 1994, Neural Networks.
[31] Hilbert J. Kappen, et al. On-line Learning Processes in Artificial Neural Networks, 1993.
[32] Yann LeCun, et al. Second Order Properties of Error Surfaces, 1990, NIPS.
[33] Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.
[34] Yann LeCun, et al. Generalization and Network Design Strategies, 1989.
[35] Alberto L. Sangiovanni-Vincentelli, et al. Efficient Parallel Learning Algorithms for Neural Networks, 1988, NIPS.
[36] G. Golub. Matrix Computations, 1983.