Stable Encoding of Finite-State Machines in Discrete-Time Recurrent Neural Nets with Sigmoid Units

There has been considerable interest in the use of discrete-time recurrent neural nets (DTRNN) to learn finite-state tasks, with interesting results regarding the induction of simple finite-state machines from input-output strings. Parallel work has studied the computational power of DTRNN in connection with finite-state computation. This article describes a simple strategy for devising stable encodings of finite-state machines in computationally capable discrete-time recurrent neural architectures with sigmoid units, and presents in detail how this strategy may be applied to encode a general class of finite-state machines in a variety of commonly used first- and second-order recurrent neural networks. Unlike previous work, which either imposed restrictions on state values or relied on a detailed analysis based on fixed-point attractors, our approach applies to any positive, bounded, strictly increasing, continuous activation function. It uses simple bounding criteria, derived from a study of the conditions under which a proposed encoding scheme guarantees that the DTRNN actually behaves as a finite-state machine.
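To make the idea of a stable encoding concrete, the following is a minimal sketch of one possible encoding of a deterministic finite automaton in a second-order network with logistic sigmoid units. It is not the article's exact construction: the one-hot state layout, the weight scheme (weight `H` on each transition, bias `-H/2`), the value `H = 8`, and all function names are assumptions chosen for illustration. For the two-state parity automaton used here, a short bounding argument shows that "high" units stay above 0.97 and "low" units stay below 0.03 for any input string, so the network state can always be read back as an automaton state.

```python
import math

def sigmoid(z):
    """Logistic function: positive, bounded, strictly increasing, continuous."""
    return 1.0 / (1.0 + math.exp(-z))

def encode_dfa(delta, n_states, n_symbols, H):
    """Build second-order weights W[i][j][k] and biases b[i] (hypothetical scheme).

    W[i][j][k] = H if the automaton moves from state j to state i on symbol k,
    and 0 otherwise; every bias is -H/2.
    """
    W = [[[H if delta[(j, k)] == i else 0.0
           for k in range(n_symbols)]
          for j in range(n_states)]
         for i in range(n_states)]
    b = [-H / 2.0] * n_states
    return W, b

def step(W, b, x, k):
    """One second-order update; the input symbol k is one-hot, so it selects a slice of W."""
    n = len(x)
    return [sigmoid(sum(W[i][j][k] * x[j] for j in range(n)) + b[i])
            for i in range(n)]

def run(W, b, x0, symbols):
    """Feed a whole string of symbol indices and return the final state vector."""
    x = list(x0)
    for k in symbols:
        x = step(W, b, x, k)
    return x

# Parity automaton over {0, 1}: state 0 = even number of 1s seen, state 1 = odd.
PARITY = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
W, b = encode_dfa(PARITY, n_states=2, n_symbols=2, H=8.0)

# Start in state 0 and read the string 1 1 0 1 (three 1s, so parity is odd).
final = run(W, b, [1.0, 0.0], [1, 1, 0, 1])
decoded_state = max(range(2), key=lambda i: final[i])
```

Here `decoded_state` is the index of the most active unit, which should match the automaton's state after reading the string. With a smaller `H` the high and low activation ranges can drift toward each other on long strings, which is the failure mode that bounding criteria of the kind described above are meant to rule out.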
