Weight groupings in the training of recurrent networks

We use a block-diagonal matrix to approximate the Hessian matrix in the Levenberg-Marquardt method for the training of recurrent neural networks. A substantial improvement in training time over the original Levenberg-Marquardt method is observed without degrading generalization ability. Three weight-grouping methods are investigated and compared: correlation blocks, k-unit blocks, and layer blocks. Their computational complexity, approximation ability, and training time are analyzed.
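As a rough illustration of the idea (a sketch, not the paper's implementation), the code below applies one Levenberg-Marquardt step in which the Gauss-Newton approximation of the Hessian, J^T J, is restricted to the diagonal blocks induced by a weight grouping. All names here (lm_block_update, blocks, mu) are illustrative assumptions; a layer-blocks grouping is shown as one possible choice.

```python
import numpy as np

def lm_block_update(jacobian, residual, blocks, mu):
    """One Levenberg-Marquardt step with a block-diagonal approximation
    of the Gauss-Newton Hessian J^T J (illustrative sketch).

    jacobian : (n_samples, n_weights) Jacobian of the residuals w.r.t. weights
    residual : (n_samples,) vector of residuals
    blocks   : list of integer index arrays, one per weight group
    mu       : Levenberg-Marquardt damping parameter
    """
    step = np.zeros(jacobian.shape[1])
    for idx in blocks:
        Jb = jacobian[:, idx]                    # Jacobian columns for this group
        Hb = Jb.T @ Jb + mu * np.eye(len(idx))   # damped diagonal block of J^T J
        gb = Jb.T @ residual                     # gradient restricted to the group
        step[idx] = np.linalg.solve(Hb, gb)      # solve a small system per block
    return step                                  # applied as w <- w - step

# Example "layer blocks" grouping: consecutive index ranges, one per layer
# (hypothetical weight counts per layer).
layer_sizes = [12, 8, 4]
offsets = np.cumsum([0] + layer_sizes)
blocks = [np.arange(offsets[i], offsets[i + 1]) for i in range(len(layer_sizes))]
```

Since solving a dense linear system of size m costs on the order of m^3 operations, k equally sized blocks of size n/k cost about n^3/k^2 in total instead of n^3 for the full matrix, which is the source of the training-time improvement.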