Efficient BackProp

[1]  Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[2]  Shun-ichi Amari, et al. Complexity Issues in Natural Gradient Descent Method for Training Multilayer Perceptrons, 1998, Neural Computation.

[3]  Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.

[4]  Vladimir Vapnik, et al. Statistical Learning Theory, 1998.

[5]  Shun-ichi Amari, et al. The Efficiency and the Robustness of Natural Gradient Descent Learning Rule, 1997, NIPS.

[6]  Genevieve B. Orr, et al. Removing Noise in On-Line Search using Adaptive Batch Sizes, 1996, NIPS.

[7]  Andreas Ziehe, et al. Adaptive On-line Learning in Changing Environments, 1996, NIPS.

[8]  Shun-ichi Amari, et al. Neural Learning in Structured Parameter Spaces - Natural Riemannian Gradient, 1996, NIPS.

[9]  Saad, et al. Exact solution for on-line learning in multilayer neural networks, 1995, Physical Review Letters.

[10]  Mark J. L. Orr, et al. Regularization in the Selection of Radial Basis Function Centers, 1995, Neural Computation.

[11]  Haim Sompolinsky, et al. On-line Learning of Dichotomies: Algorithms and Learning Curves, 1995, NIPS.

[12]  Christopher M. Bishop, et al. Neural Networks for Pattern Recognition, 1995.

[13]  W. Wiegerinck, et al. Stochastic dynamics of learning with momentum in neural networks, 1994.

[14]  Wray L. Buntine, et al. Computing second derivatives in feed-forward networks: a review, 1994, IEEE Trans. Neural Networks.

[15]  Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.

[16]  Patrick van der Smagt. Minimisation methods for training feedforward neural networks, 1994, Neural Networks.

[17]  J. G. Taylor, et al. Mathematical Approaches to Neural Networks, 1993.

[18]  Martin Fodslette Møller, et al. A scaled conjugate gradient algorithm for fast supervised learning, 1993, Neural Networks.

[19]  Hilbert J. Kappen, et al. On-line learning processes in artificial neural networks, 1993.

[20]  Barak A. Pearlmutter, et al. Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors, 1992, NIPS.

[21]  M. Moller, et al. Supervised learning on large redundant training sets, 1992, Neural Networks for Signal Processing II: Proceedings of the 1992 IEEE Workshop.

[22]  Richard S. Sutton, et al. Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta, 1992, AAAI.

[23]  Roberto Battiti, et al. First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method, 1992, Neural Computation.

[24]  Elie Bienenstock, et al. Neural Networks and the Bias/Variance Dilemma, 1992, Neural Computation.

[25]  Pierre Priouret, et al. Adaptive Algorithms and Stochastic Approximations, 1990, Applications of Mathematics.

[26]  John E. Moody, et al. Note on Learning Rate Schedules for Stochastic Optimization, 1990, NIPS.

[27]  Yann LeCun, et al. Second Order Properties of Error Surfaces, 1990, NIPS.

[28]  John Moody, et al. Fast Learning in Networks of Locally-Tuned Processing Units, 1989, Neural Computation.

[29]  Geoffrey E. Hinton, et al. Phoneme recognition using time-delay neural networks, 1989, IEEE Trans. Acoust. Speech Signal Process.

[30]  Yann LeCun, et al. Generalization and network design strategies, 1989.

[31]  Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.

[32]  R. Fletcher. Practical Methods of Optimization, 1988.

[33]  Robert A. Jacobs, et al. Increased rates of convergence through learning rate adaptation, 1987, Neural Networks.

[34]  Alberto L. Sangiovanni-Vincentelli, et al. Efficient Parallel Learning Algorithms for Neural Networks, 1988, NIPS.

[35]  Yann LeCun. Modèles connexionnistes de l'apprentissage (Connectionist Learning Models), PhD thesis, 1987.

[36]  G. Golub. Matrix Computations, 1983.