A novel method for speed training acceleration of recurrent neural networks

Abstract Although recurrent neural networks (RNNs) successfully solve many difficult problems, their computational complexity significantly increases training time. Therefore, a primary challenge in applying RNNs is shortening the time needed to train and operate the network. An effective solution to this problem is parallel processing. In this paper, a particular approach for the Jordan network is shown; however, the presented idea is applicable to other RNN structures. This type of network is characterized by natural parallelism, and in the paper this feature is used to significantly accelerate the learning process. High-performance learning has been achieved using a novel parallel three-dimensional architecture. The presented solutions can be implemented in digital hardware.
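
For readers unfamiliar with the Jordan architecture mentioned in the abstract, the listing below is a minimal sketch of a single Jordan-network step in plain Python/NumPy. It assumes the standard textbook formulation, in which the context units hold the previous output and feed it back into the hidden layer; the weight names, dimensions, and tanh activations are illustrative assumptions and do not reproduce the paper's parallel three-dimensional learning architecture.

    import numpy as np

    # Illustrative sketch of a Jordan-style recurrent step (hypothetical
    # dimensions and weight names; not the authors' parallel implementation).
    rng = np.random.default_rng(0)
    n_in, n_hidden, n_out = 4, 8, 2

    W_x = rng.standard_normal((n_hidden, n_in)) * 0.1   # input  -> hidden
    W_c = rng.standard_normal((n_hidden, n_out)) * 0.1  # context (previous output) -> hidden
    W_y = rng.standard_normal((n_out, n_hidden)) * 0.1  # hidden -> output
    b_h = np.zeros(n_hidden)
    b_y = np.zeros(n_out)

    def jordan_step(x, y_prev):
        """One forward step: the hidden layer sees the current input and the previous output."""
        h = np.tanh(W_x @ x + W_c @ y_prev + b_h)
        y = np.tanh(W_y @ h + b_y)
        return h, y

    # Process a short input sequence, carrying the output forward as context.
    y = np.zeros(n_out)
    for t in range(5):
        x_t = rng.standard_normal(n_in)
        h, y = jordan_step(x_t, y)

Because each hidden unit in such a step depends only on the shared input and context vectors, the hidden-layer computations are mutually independent, which is the "natural parallelism" the abstract refers to and what a parallel hardware realisation can exploit.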
