Learning a Deterministic Finite Automaton with a Recurrent Neural Network

We consider the problem of learning a finite automaton from positive evidence with a recurrent neural network. We train an Elman recurrent neural network on a set of sentences from a language and extract a finite automaton by clustering the states of the trained network. We observe that the extracted automaton recognizes a language that generalizes beyond the training set, and that these generalizations are due to the training regime: the network performs a “loose” minimization of the prefix DFA of the training set, that is, the automaton that has one state for each prefix of the sentences in the set.
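To make the notion of a prefix DFA concrete, the following minimal sketch builds one from a small training set, with one state per distinct prefix. The function names (`build_prefix_dfa`, `accepts`) and the toy sentences are illustrative assumptions, not taken from the paper, and the sketch covers only the prefix DFA, not the network training or the state clustering.

```python
def build_prefix_dfa(sentences):
    """Build the prefix DFA of a finite set of sentences.

    Each state is a prefix, stored as a tuple of symbols; the empty
    tuple is the start state. (Hypothetical helper for illustration.)
    """
    start = ()
    states = {start}
    transitions = {}  # maps (state, symbol) -> next state
    accepting = set()
    for sentence in sentences:
        state = start
        for symbol in sentence:
            next_state = state + (symbol,)
            transitions[(state, symbol)] = next_state
            states.add(next_state)
            state = next_state
        # The state reached at the end of a training sentence accepts.
        accepting.add(state)
    return states, transitions, accepting, start


def accepts(dfa, sentence):
    """Return True if the DFA accepts the sentence."""
    _, transitions, accepting, state = dfa
    for symbol in sentence:
        key = (state, symbol)
        if key not in transitions:
            return False  # no transition defined: reject
        state = transitions[key]
    return state in accepting


# Toy training set (an assumption made for this example).
training_set = ["ab", "abab", "aab"]
dfa = build_prefix_dfa(training_set)
```

By construction, this automaton accepts exactly the training sentences and nothing else, so any generalization has to come from merging its states, which is the role the abstract attributes to the network's “loose” minimization.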
