A Convergence Result for Learning in Recurrent Neural Networks

We give a rigorous analysis of the convergence properties of a backpropagation algorithm for recurrent networks containing either output or hidden layer recurrence. The conditions permit data generated by stochastic processes with considerable dependence. Restrictions are offered that may help assure convergence of the network parameters to a local optimum, as some simulations illustrate.
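The kind of recursive update under study can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's algorithm or conditions: a single tanh hidden unit with hidden-layer recurrence, an AR(1) input as a stand-in for dependent data, a learning rate of 1/(t + 10), and a clipping bound of 0.9 on the recurrent weight as a stand-in for a parameter restriction. The names `train_recurrent_sketch`, `w_x`, `w_h`, `v`, and `bound` are hypothetical.

```python
import numpy as np


def train_recurrent_sketch(n_steps=2000, bound=0.9, seed=0):
    """Online gradient learning for a one-unit recurrent network (illustrative only)."""
    rng = np.random.default_rng(seed)
    w_x, w_h, v = 0.1, 0.1, 0.1   # input, recurrent, and output weights (assumed start)
    h = 0.0                       # hidden state h_{t-1}
    dh_dwx, dh_dwh = 0.0, 0.0     # recursively updated derivatives of h w.r.t. w_x, w_h
    x = 0.0
    for t in range(1, n_steps + 1):
        # Dependent data: AR(1) process; the task is one-step-ahead prediction.
        x_prev = x
        x = 0.5 * x_prev + rng.normal(scale=0.5)

        # Forward pass with hidden-layer recurrence.
        h_new = np.tanh(w_x * x_prev + w_h * h)
        slope = 1.0 - h_new ** 2

        # The hidden state depends on past parameters through h_{t-1},
        # so its derivatives are propagated forward in time.
        dh_dwx_new = slope * (x_prev + w_h * dh_dwx)
        dh_dwh_new = slope * (h + w_h * dh_dwh)

        err = x - v * h_new
        eta = 1.0 / (t + 10)      # decreasing learning rate (assumed schedule)

        # Gradient step on the squared error, using the pre-update value of v.
        step_wx = eta * err * v * dh_dwx_new
        step_wh = eta * err * v * dh_dwh_new
        step_v = eta * err * h_new
        w_x += step_wx
        w_h += step_wh
        v += step_v

        # Assumed restriction: keep the recurrent weight inside [-bound, bound].
        w_h = float(np.clip(w_h, -bound, bound))

        h, dh_dwx, dh_dwh = h_new, dh_dwx_new, dh_dwh_new
    return w_x, w_h, v


if __name__ == "__main__":
    print(train_recurrent_sketch())
```

Clipping the recurrent weight after each step keeps the derivative recursion from growing without bound, which is the role a restriction on the parameters plays in this sketch. The values chosen here are arbitrary and carry no guarantee of convergence.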
