Paper information - Recursive Bayesian Recurrent Neural Networks for Time-Series Modeling

Recursive Bayesian Recurrent Neural Networks for Time-Series Modeling

This paper develops a probabilistic approach to recursive second-order training of recurrent neural networks (RNNs) for improved time-series modeling. A general recursive Bayesian Levenberg-Marquardt algorithm is derived to sequentially update the weights and the covariance (Hessian) matrix. The main strengths of the approach are a principled handling of the regularization hyperparameters that leads to better generalization, and stable numerical performance. The framework involves the adaptation of a noise hyperparameter and local weight prior hyperparameters, which represent the noise in the data and the uncertainties in the model parameters. Experimental investigations using artificial and real-world data sets show that RNNs equipped with the proposed approach outperform standard real-time recurrent learning and extended Kalman training algorithms for recurrent networks, as well as other contemporary nonlinear neural models, on time-series modeling.
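As a rough illustration of what "sequentially update the weights and the covariance matrix" can look like, the following is a minimal sketch of a recursive Gauss-Newton step (in the familiar recursive-least-squares / extended-Kalman form) applied to a linear model. It is not the paper's algorithm: the Levenberg-Marquardt damping, the adaptation of the hyperparameters, and the RNN derivative computation are all omitted. The names `recursive_update`, `fit_linear`, `alpha` (fixed per-weight prior precisions) and `beta` (fixed noise precision) are assumptions introduced only for this example.

```python
def recursive_update(w, P, j, err, beta):
    """One recursive Gauss-Newton (RLS/EKF-style) step.

    w    : current weights (list of floats)
    P    : current weight covariance (symmetric list of lists)
    j    : gradient of the model output w.r.t. the weights
    err  : target minus prediction at the current weights
    beta : assumed noise precision (1 / noise variance), held fixed here
    """
    n = len(w)
    # P @ j
    Pj = [sum(P[r][c] * j[c] for c in range(n)) for r in range(n)]
    # Predictive variance of the output: noise variance + j^T P j
    s = 1.0 / beta + sum(j[r] * Pj[r] for r in range(n))
    # Gain vector
    k = [v / s for v in Pj]
    # Weight update
    w_new = [w[r] + k[r] * err for r in range(n)]
    # Covariance update: P - k (j^T P); uses the symmetry of P
    P_new = [[P[r][c] - k[r] * Pj[c] for c in range(n)] for r in range(n)]
    return w_new, P_new


def fit_linear(xs, ys, alpha, beta):
    """Process the data one point at a time for the linear model y = w . x.

    alpha : assumed per-weight prior precisions (a "local" prior per weight),
            used only to initialise the covariance as diag(1 / alpha).
    """
    n = len(xs[0])
    w = [0.0] * n
    P = [[(1.0 / alpha[r] if r == c else 0.0) for c in range(n)]
         for r in range(n)]
    for x, y in zip(xs, ys):
        pred = sum(wi * xi for wi, xi in zip(w, x))
        # For a linear model the output gradient w.r.t. w is simply x
        w, P = recursive_update(w, P, x, y - pred, beta)
    return w, P


if __name__ == "__main__":
    xs = [[1.0], [2.0], [3.0]]
    ys = [2.0, 4.1, 5.9]
    w, P = fit_linear(xs, ys, alpha=[0.5], beta=4.0)
    print(w, P)
```

For a linear model this recursion reproduces the batch Bayesian ridge solution, with covariance (diag(alpha) + beta XᵀX)⁻¹. For a nonlinear network, `j` would be the output gradient at the current weights, and the result would only approximate the posterior.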

Nikolay I. Nikolaev | Derrick Takeshi Mirikitani
