Fast Exact Multiplication by the Hessian