Weight groupings in the training of recurrent networks

We use a block-diagonal matrix to approximate the Hessian matrix in the Levenberg-Marquardt method for the training of recurrent neural networks. A substantial improvement in training time over the original Levenberg-Marquardt method is observed without degrading generalization ability. Three weight-grouping methods are investigated and compared: correlation blocks, k-unit blocks, and layer blocks. Their computational complexity, approximation ability, and training time are analyzed.
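As a rough illustration of the idea (a sketch, not the paper's implementation), the code below applies one Levenberg-Marquardt step in which the Gauss-Newton approximation of the Hessian, J^T J, is restricted to the diagonal blocks induced by a weight grouping. All names here (lm_block_update, blocks, mu) are illustrative assumptions; a layer-blocks grouping is shown as one possible choice.

```python
import numpy as np

def lm_block_update(jacobian, residual, blocks, mu):
    """One Levenberg-Marquardt step with a block-diagonal approximation
    of the Gauss-Newton Hessian J^T J (illustrative sketch).

    jacobian : (n_samples, n_weights) Jacobian of the residuals w.r.t. weights
    residual : (n_samples,) vector of residuals
    blocks   : list of integer index arrays, one per weight group
    mu       : Levenberg-Marquardt damping parameter
    """
    step = np.zeros(jacobian.shape[1])
    for idx in blocks:
        Jb = jacobian[:, idx]                    # Jacobian columns for this group
        Hb = Jb.T @ Jb + mu * np.eye(len(idx))   # damped diagonal block of J^T J
        gb = Jb.T @ residual                     # gradient restricted to the group
        step[idx] = np.linalg.solve(Hb, gb)      # solve a small system per block
    return step                                  # applied as w <- w - step

# Example "layer blocks" grouping: consecutive index ranges, one per layer
# (hypothetical weight counts per layer).
layer_sizes = [12, 8, 4]
offsets = np.cumsum([0] + layer_sizes)
blocks = [np.arange(offsets[i], offsets[i + 1]) for i in range(len(layer_sizes))]
```

Since solving a dense linear system of size m costs on the order of m^3 operations, k equally sized blocks of size n/k cost about n^3/k^2 in total instead of n^3 for the full matrix, which is the source of the training-time improvement.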