An Algebraic Framework to Represent Finite State Machines in Single-Layer Recurrent Neural Networks

In this paper we present an algebraic framework for representing finite state machines (FSMs) in single-layer recurrent neural networks (SLRNNs), which unifies and generalizes several previous proposals. The framework is based on formulating both the state transition function and the output function of an FSM as a linear system of equations, and it permits an analytical explanation of the representational capabilities of first-order and higher-order SLRNNs. The framework can be used to insert symbolic knowledge into RNNs prior to learning from examples and to preserve this knowledge while the network is trained. The approach is valid for a wide range of activation functions, provided certain stability conditions are met. The framework has already been used in practice in a hybrid method for grammatical inference reported elsewhere (Sanfeliu and Alquézar 1994).
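The sketch below is a minimal, illustrative rendering of the central idea (not the paper's exact construction): write the FSM transition function as a linear system over the weights of a second-order SLRNN and solve it, then check that the resulting network simulates the automaton. The two-state parity automaton, the target magnitude H, the sigmoid discriminant, and the thresholded state read-out are all assumptions introduced here for illustration.

```python
# Minimal sketch: encode an FSM's transition function as a linear system
# A W = B over the weights of a second-order single-layer recurrent network.
# The automaton, constants and read-out below are illustrative assumptions.
import numpy as np

# Example FSM: parity of the number of 1s over the alphabet {0, 1}.
states = ["even", "odd"]
symbols = ["0", "1"]
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}

nQ, nS = len(states), len(symbols)
H = 5.0  # pre-activation target magnitude; sigmoid(+/-H) is close to 1 / 0

def one_hot(i, n):
    v = np.zeros(n)
    v[i] = 1.0
    return v

# Each row of A is the flattened outer product of the current state code and
# the current input code (the "second-order" input of the network); the
# matching row of B holds pre-activation targets whose sigmoid approximates
# the one-hot code of the next state.
A = np.zeros((nQ * nS, nQ * nS))
B = np.zeros((nQ * nS, nQ))
row = 0
for i, q in enumerate(states):
    for k, a in enumerate(symbols):
        A[row] = np.outer(one_hot(i, nQ), one_hot(k, nS)).ravel()
        j = states.index(delta[(q, a)])
        B[row] = np.where(one_hot(j, nQ) > 0, H, -H)
        row += 1

# Solve the linear system A W = B for the weight matrix W.
W, *_ = np.linalg.lstsq(A, B, rcond=None)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run(net_W, string, start="even"):
    """Simulate the SLRNN and read the state back by thresholding at 0.5."""
    s = one_hot(states.index(start), nQ)
    for a in string:
        x = one_hot(symbols.index(a), nS)
        s = sigmoid(net_W.T @ np.outer(s, x).ravel())
        s = (s > 0.5).astype(float)  # hard re-encoding keeps the state code exact
    return states[int(np.argmax(s))]

# Quick check against the symbolic FSM on random strings.
rng = np.random.default_rng(0)
for _ in range(20):
    w = "".join(rng.choice(symbols, size=rng.integers(1, 10)))
    q = "even"
    for a in w:
        q = delta[(q, a)]
    assert run(W, w) == q
print("network reproduces the FSM transitions")
```

In this illustrative setting each row of A selects exactly one second-order weight, so the linear system is always solvable, which is one way to read the analytical argument for the representational power of higher-order SLRNNs relative to first-order ones.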

[1] Noga Alon et al. Efficient simulation of finite automata by neural nets. JACM, 1991.

[2] M. Goudreau et al. First-order vs. Second-order Single Layer Recurrent Neural Networks. 1994.

[3] Padhraic Smyth et al. Learning Finite State Machines With Self-Clustering Recurrent Networks. Neural Computation, 1993.

[4] Ronald J. Williams et al. A Learning Algorithm for Continually Running Fully Recurrent Neural Networks. Neural Computation, 1989.

[5] James L. McClelland et al. Finite State Automata and Simple Recurrent Networks. Neural Computation, 1989.

[6] C. L. Giles et al. On recurrent neural networks and representing finite-state recognizers. 1993.

[7] Marvin Minsky et al. Computation: Finite and Infinite Machines. 2016.

[8] Alberto Sanfeliu et al. Representation and Recognition of Regular Grammars by Means of Second-Order Recurrent Neural Networks. IWANN, 1993.

[9] C. Lee Giles et al. Training Second-Order Recurrent Neural Networks using Hints. ML, 1992.

[10] Karvel K. Thornber et al. Representation of fuzzy finite state automata in continuous recurrent neural networks. Proceedings of the International Conference on Neural Networks (ICNN'96), 1996.

[11] Stefan C. Kremer et al. Comments on "Constructive learning of recurrent neural networks: limitations of recurrent cascade correlation and a simple solution". IEEE Trans. Neural Networks, 1996.

[12] Jeffrey L. Elman et al. Finding Structure in Time. Cognitive Science, 1990.

[13] Giovanni Soda et al. An unified approach for integrating explicit knowledge and learning by example in recurrent networks. IJCNN-91 Seattle International Joint Conference on Neural Networks, 1991.

[14] Alberto Sanfeliu et al. Active Grammatical Inference: A New Learning Methodology. 1994.

[15] C. Lee Giles et al. Learning and Extracting Finite State Automata with Second-Order Recurrent Neural Networks. Neural Computation, 1992.

[16] Srimat T. Chakradhar et al. First-order versus second-order single-layer recurrent neural networks. IEEE Trans. Neural Networks, 1994.

[17] Alberto Sanfeliu et al. Understanding Neural Networks for Grammatical Inference and Recognition. 1993.

[18] D. C. Cooper et al. Sequential Machines and Automata Theory. Comput. J., 1968.

[19] Alberto Sanfeliu et al. Learning bidimensional context-dependent models using a context-sensitive language. Proceedings of the 13th International Conference on Pattern Recognition, 1996.

[20] C. Lee Giles et al. Stable Encoding of Large Finite-State Automata in Recurrent Neural Networks with Sigmoid Discriminants. Neural Computation, 1996.