Simple Strategies to Encode Tree Automata in Sigmoid Recursive Neural Networks

Recently, a number of authors have explored the use of recursive neural networks (RNNs) for the adaptive processing of trees and tree-like structures. One of the most important language-theoretical formalizations of the processing of tree-structured data is the deterministic finite-state tree automaton (DFSTA). A DFSTA may easily be realized as an RNN using discrete-state units, such as threshold linear units. A recent result by J. Sima (1997) shows that any threshold linear unit operating on binary inputs can be simulated by an analog unit using a continuous activation function and bounded real inputs. The constructive proof finds a scaling factor for the weights and reestimates the bias accordingly. We explore the application of this result to the simulation of DFSTA in sigmoid RNNs (that is, analog RNNs using monotonically increasing activation functions) and also present an alternative scheme for one-hot encoding of the input that yields smaller weight values and therefore operates at a lower saturation level.
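The weight-scaling construction described above can be illustrated with a minimal sketch (not taken from the paper; the weights, bias, and scaling factor `H` below are illustrative choices): a sigmoid unit whose weights and bias are multiplied by a sufficiently large factor `H` reproduces the output of the corresponding threshold unit on every binary input, up to a small saturation error.

```python
import math

def step(z):
    """Discrete-state activation: fires iff the net input is nonnegative."""
    return 1.0 if z >= 0.0 else 0.0

def sigmoid(z):
    """Continuous analog activation (logistic function)."""
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative threshold unit on binary inputs: fires iff w . x + b >= 0.
# The bias is chosen so the net input is bounded away from zero on {0,1}^2.
w = [2.0, -1.0]
b = -0.5

# Scaling factor: larger H drives the sigmoid deeper into saturation,
# making its output closer to the 0/1 values of the threshold unit.
H = 20.0

def threshold_unit(x):
    return step(sum(wi * xi for wi, xi in zip(w, x)) + b)

def scaled_sigmoid_unit(x):
    # Same unit with weights and bias scaled by H:
    # sigmoid(H * (w . x + b)) approximates step(w . x + b).
    return sigmoid(H * (sum(wi * xi for wi, xi in zip(w, x)) + b))

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, threshold_unit(x), round(scaled_sigmoid_unit(x), 5))
```

Because the net input never equals zero on binary inputs, increasing `H` makes the approximation error shrink exponentially, which is the sense in which the analog unit stably simulates the discrete one.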

[1] Mikel L. Forcada, et al. Learning the Initial State of a Second-Order Recurrent Neural Network during Regular-Language Inference, 1995, Neural Computation.

[2] Alfred V. Aho, et al. Currents in the Theory of Computing, 1973.

[3] Stefan C. Kremer, et al. On the computational power of Elman-style recurrent networks, 1995, IEEE Trans. Neural Networks.

[4] J. Sima, et al. Analog Stable Simulation of Discrete Neural Networks, 1997.

[5] Alberto Sanfeliu, et al. An Algebraic Framework to Represent Finite State Machines in Single-Layer Recurrent Neural Networks, 1995, Neural Computation.

[6] Peter C. Chapin. Formal Languages I, 1973, CSC '73.

[7] James W. Thatcher, et al. Tree Automata: An Informal Survey, 1974.

[8] M. W. Shields. An Introduction to Automata Theory, 1988.

[9] Jeffrey D. Ullman, et al. Introduction to Automata Theory, Languages and Computation, 1979.

[10] Mikel L. Forcada, et al. Encoding of sequential translators in discrete-time recurrent neural nets, 1999, ESANN.

[11] C. Lee Giles, et al. Learning and Extracting Finite State Automata with Second-Order Recurrent Neural Networks, 1992, Neural Computation.

[12] C. Lee Giles, et al. Stable Encoding of Large Finite-State Automata in Recurrent Neural Networks with Sigmoid Discriminants, 1996, Neural Computation.

[13] Alessandro Sperduti, et al. On the Computational Power of Recurrent Neural Networks for Structures, 1997, Neural Networks.

[14] Alessandro Sperduti, et al. A general framework for adaptive processing of data structures, 1998, IEEE Trans. Neural Networks.

[15] Mikel L. Forcada, et al. Stable Encoding of Finite-State Machines in Discrete-Time Recurrent Neural Nets with Sigmoid Units, 2000, Neural Computation.

[16] Mikel L. Forcada, et al. Constrained Second-Order Recurrent Networks for Finite-State Automata Induction, 1998.

[17] Jeffrey L. Elman, et al. Finding Structure in Time, 1990, Cognitive Science.

[18] C. Lee Giles, et al. Constructing deterministic finite-state automata in recurrent neural networks, 1996, JACM.

[19] Srimat T. Chakradhar, et al. First-order versus second-order single-layer recurrent neural networks, 1994, IEEE Trans. Neural Networks.

[20] Alessandro Sperduti, et al. Supervised neural networks for the classification of structures, 1997, IEEE Trans. Neural Networks.