Modelling long-range temporal dependencies in recurrent neural networks is a well-known problem. As previously established, gradient descent fails on this task because the back-propagated error decays or explodes exponentially. Moreover, gradient descent is not scale invariant in the parameters it seeks to optimize. Recently, a Hessian-free (HF) approach for training recurrent neural networks was proposed to address these two issues. Although it is successful to a large extent, the problem of exploding or decaying gradients reappears depending on the random initialization. We propose a collocation-based scheme for training neural networks and report its performance for simple architectures. Training recurrent neural networks with the proposed scheme, and comparing it with the HF-based technique on both success rate and use of computational resources, is currently under study, with no conclusive results yet.