A Hybrid Training Algorithm for Recurrent Neural Network Using Particle Swarm Optimization-Based Preprocessing and Temporal Error Aggregation

Recurrent neural networks have been widely used as auto-regressive models for time series. The most common training method for recurrent neural networks is backpropagation. However, recurrent neural networks trained with backpropagation can become trapped at local minima and saddle points, in which case the auto-regressive model cannot effectively capture the patterns in the time series. To address these problems, we propose a hybrid recurrent neural network training algorithm that consists of two phases: exploration and exploitation. The exploration phase uses synchronous particle swarm optimization to search for parameter settings with a high activation score and low error. In the exploitation phase, the parameter settings found during exploration are refined with the proposed enhanced backpropagation, which improves on traditional backpropagation by aggregating temporal errors across timestamps. We evaluate the proposed methods on four real-world datasets. Applied to both regularized and adaptive-momentum backpropagation, our algorithm increases convergence speed by 10% to 20% and reduces the testing mean square error (MSE) at convergence by 5% to 30%. Using particle swarm optimization and the activation list in the exploration phase, the hybrid training algorithm reduces testing MSEs at convergence by more than 30% compared with traditional backpropagation.
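
To make the two-phase structure concrete, the following minimal Python/NumPy sketch pairs a synchronous PSO search over flattened Elman-RNN weights with a subsequent gradient refinement whose loss sums the squared error at every timestamp. All names and hyperparameters here are illustrative assumptions, a finite-difference gradient stands in for the paper's enhanced backpropagation through time, and the activation score / activation list mechanism is not reproduced.

```python
# Hypothetical sketch of the two-phase idea (not the authors' implementation):
# PSO explores RNN weight vectors, and the best particle seeds a gradient
# refinement whose loss aggregates the error across all timestamps.
import numpy as np

rng = np.random.default_rng(0)

def unpack(theta, n_in, n_h):
    """Split a flat parameter vector into Elman-RNN weight matrices."""
    i = 0
    W_ih = theta[i:i + n_in * n_h].reshape(n_in, n_h); i += n_in * n_h
    W_hh = theta[i:i + n_h * n_h].reshape(n_h, n_h);   i += n_h * n_h
    w_ho = theta[i:i + n_h]
    return W_ih, W_hh, w_ho

def rnn_mse(theta, xs, ys, n_in=1, n_h=8):
    """One-step-ahead MSE of a tanh Elman RNN; the per-timestamp squared
    errors are summed (temporal error aggregation) before averaging."""
    W_ih, W_hh, w_ho = unpack(theta, n_in, n_h)
    h, err = np.zeros(n_h), 0.0
    for x, y in zip(xs, ys):
        h = np.tanh(x * W_ih[0] + h @ W_hh)
        err += (h @ w_ho - y) ** 2
    return err / len(xs)

def pso_explore(loss, dim, n_particles=20, iters=40, w=0.7, c1=1.5, c2=1.5):
    """Exploration phase: synchronous PSO, i.e. every particle updates
    against the same global best within an iteration."""
    pos = rng.normal(0.0, 0.5, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_f = pos.copy(), np.array([loss(p) for p in pos])
    gbest = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos += vel
        f = np.array([loss(p) for p in pos])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = pos[improved], f[improved]
        gbest = pbest[pbest_f.argmin()].copy()  # synchronous global update
    return gbest

def exploit(loss, theta, lr=0.05, steps=50, eps=1e-4):
    """Exploitation phase: descend the aggregated temporal loss. A slow
    finite-difference gradient stands in here for full BPTT."""
    for _ in range(steps):
        grad = np.array([(loss(theta + eps * e) - loss(theta - eps * e))
                         / (2 * eps) for e in np.eye(theta.size)])
        theta = theta - lr * grad
    return theta

# Usage on a toy sine series: PSO seeds the weights, gradients refine them.
xs = np.sin(np.linspace(0, 6 * np.pi, 120)); ys = np.roll(xs, -1)
dim = 1 * 8 + 8 * 8 + 8
loss = lambda th: rnn_mse(th, xs[:-1], ys[:-1])
theta = exploit(loss, pso_explore(loss, dim))
print("final MSE:", loss(theta))
```

The handoff between phases is the only coupling point: exploration returns a flat weight vector, so any gradient-based exploiter can consume it unchanged.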
