Computational capabilities of recurrent NARX neural networks

Recently, fully connected recurrent neural networks have been proven to be computationally rich: they are at least as powerful as Turing machines. This work focuses on another class of recurrent networks that is popular in control applications and has been found to be very effective at learning a variety of problems. These networks are based upon Nonlinear AutoRegressive models with eXogenous Inputs (NARX models) and are therefore called NARX networks. In contrast to other recurrent networks, NARX networks have limited feedback, which comes only from the output neuron rather than from hidden states. They are formalized by y(t) = Psi(u(t - n_u), ..., u(t - 1), u(t), y(t - n_y), ..., y(t - 1)), where u(t) and y(t) denote the input and output of the network at time t, n_u and n_y are the input and output orders, and Psi is the mapping computed by a multilayer perceptron. We constructively prove that NARX networks with a finite number of parameters are computationally as strong as fully connected recurrent networks, and therefore as strong as Turing machines. We conclude that, in theory, NARX models can be used in place of conventional recurrent networks without any computational loss, even though their feedback is limited. Furthermore, these results raise the question of how much feedback or recurrence is necessary for a network to be Turing equivalent, and which restrictions on feedback limit computational power.
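To make the model concrete, the following is a minimal sketch of a NARX forward pass in Python/NumPy. The orders n_u = n_y = 2, the single sigmoid hidden layer, and the random weights are illustrative assumptions, not the construction used in the proof; the point is only that the network's sole recurrence is a tapped delay line of its own past outputs fed back alongside a tapped delay line of inputs.

```python
import numpy as np

# Illustrative sketch of a NARX network forward pass (assumed layer sizes,
# activation, and random weights; not the construction from the paper).

def mlp(x, W1, b1, W2, b2):
    """One-hidden-layer perceptron Psi with a sigmoid hidden layer."""
    h = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))
    return W2 @ h + b2

def narx_step(u_hist, y_hist, params):
    """y(t) = Psi(u(t-n_u), ..., u(t), y(t-n_y), ..., y(t-1)).

    u_hist holds the n_u + 1 most recent inputs (oldest first, u(t) last);
    y_hist holds the n_y most recent outputs (oldest first).
    """
    x = np.concatenate([u_hist, y_hist])
    return mlp(x, *params)

if __name__ == "__main__":
    n_u, n_y, hidden = 2, 2, 8          # input/output orders (assumed values)
    rng = np.random.default_rng(0)
    in_dim = (n_u + 1) + n_y            # tapped inputs plus fed-back outputs
    params = (rng.standard_normal((hidden, in_dim)), np.zeros(hidden),
              rng.standard_normal((1, hidden)), np.zeros(1))

    u = rng.standard_normal(20)         # an arbitrary input sequence
    u_hist = np.zeros(n_u + 1)
    y_hist = np.zeros(n_y)
    for t in range(len(u)):
        u_hist = np.roll(u_hist, -1); u_hist[-1] = u[t]
        y_t = narx_step(u_hist, y_hist, params)[0]
        y_hist = np.roll(y_hist, -1); y_hist[-1] = y_t   # feedback comes only from the output
        print(f"t={t:2d}  y(t)={y_t:+.4f}")
```

In this sketch the hidden-layer activations are recomputed from scratch at every step and never fed back, which is exactly the restriction the abstract describes: all recurrence passes through the output neuron.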
