What to remember: how memory order affects the performance of NARX neural networks

It has been shown that gradient-descent learning can be more effective in NARX networks than in other recurrent neural networks with "hidden states" on problems such as grammatical inference and nonlinear system identification. For these problems, NARX neural networks can converge faster and generalize better. Part of the reason can be attributed to the embedded memory of NARX networks, which reduces the network's sensitivity to long-term dependencies. In this paper, we experimentally explore the effect of the order of the embedded memory of NARX networks on learning ability and generalization performance for the problems above. We show that the embedded memory plays a crucial role in learning and generalization. In particular, generalization performance can be seriously deficient if the embedded memory is either inadequate or unnecessarily prodigal, but is quite good if the order of the network matches that of the problem.
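To make the notion of embedded memory order concrete, the following is a minimal sketch (not taken from the paper; the function names and the simple tapped-delay construction are illustrative) of how a NARX model forms its regressor from the last n_y past outputs and n_u recent inputs, so that the memory orders n_y and n_u directly determine what the network can "remember":

```python
import numpy as np

def run_narx(f, u, ny, nu):
    """Simulate a NARX model y(t) = f(y(t-1..t-ny), u(t..t-nu+1))
    over an input sequence u. The memory orders ny and nu set the
    size of the tapped-delay regressor (the 'embedded memory');
    missing history at the start of the sequence is zero-padded."""
    y = np.zeros(len(u))
    for t in range(len(u)):
        # Most-recent-first window of past outputs, zero-padded to length ny.
        y_hist = y[max(0, t - ny):t][::-1]
        y_hist = np.pad(y_hist, (0, ny - len(y_hist)))
        # Most-recent-first window of inputs (including u[t]), length nu.
        u_hist = u[max(0, t - nu + 1):t + 1][::-1]
        u_hist = np.pad(u_hist, (0, nu - len(u_hist)))
        # f maps the concatenated regressor to the next output; in a NARX
        # network f would be a feedforward neural network.
        y[t] = f(np.concatenate([y_hist, u_hist]))
    return y
```

Choosing ny and nu smaller than the order of the underlying system leaves information out of the regressor, while choosing them much larger inflates the input dimension with irrelevant taps; the paper's experiments concern exactly this trade-off.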
