Efficient learning of typical finite automata from random walks

This paper describes new and efficient algorithms for learning deterministic finite automata. Our approach is primarily distinguished by two features: (1) the adoption of an average-case setting to model the "typical" labeling of a finite automaton, while retaining a worst-case model for the underlying graph of the automaton, and (2) a learning model in which the learner is not provided with the means to experiment with the machine, but rather must learn solely by observing the automaton's output behavior on a random input sequence. The main contribution of this paper is in presenting the first efficient algorithms for learning nontrivial classes of automata in an entirely passive learning model. We adopt an on-line learning model in which the learner is asked to predict the output of the next state, given the next symbol of the random input sequence; the goal of the learner is to make as few prediction mistakes as possible. Assuming the learner has a means of resetting the target machine to a fixed start state, we first present an efficient algorithm that ...

Article no. IC972648
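To make the on-line prediction protocol described above concrete, the following is a minimal Python sketch of the mistake-counting loop: a random walk drives a small automaton, and before each step the learner must guess the output of the next state from the next input symbol. Everything named here is an assumption made for illustration: `make_random_dfa`, `ContextPredictor`, and `run_online` are hypothetical helpers, the transition graph is drawn at random purely for convenience (the paper treats the graph as worst-case and only the labeling as random), and the learner is a naive context-based baseline, not one of the algorithms of the paper.

```python
import random


def make_random_dfa(n_states, alphabet, rng):
    """Hypothetical helper: build a toy automaton with random labels.

    The transitions are also random here, purely for demonstration; the
    paper keeps the graph worst-case and randomizes only the labeling.
    """
    delta = {(q, a): rng.randrange(n_states)
             for q in range(n_states) for a in alphabet}
    label = [rng.randrange(2) for _ in range(n_states)]
    return delta, label


class ContextPredictor:
    """Naive baseline (an assumption, not the paper's algorithm).

    Predicts the majority output previously observed after the same
    window of the last k (symbol, output) pairs plus the next symbol.
    """

    def __init__(self, k):
        self.k = k
        self.counts = {}   # (history, symbol) -> (zeros seen, ones seen)
        self.history = ()  # last k (symbol, output) pairs

    def predict(self, symbol):
        zeros, ones = self.counts.get((self.history, symbol), (0, 0))
        return 1 if ones > zeros else 0

    def update(self, symbol, output):
        key = (self.history, symbol)
        zeros, ones = self.counts.get(key, (0, 0))
        self.counts[key] = (zeros + int(output == 0), ones + int(output == 1))
        extended = self.history + ((symbol, output),)
        self.history = extended[-self.k:] if self.k > 0 else ()


def run_online(delta, label, alphabet, learner, steps, rng, start=0):
    """Run the passive on-line protocol and return the number of mistakes."""
    state = start
    mistakes = 0
    for _ in range(steps):
        symbol = rng.choice(alphabet)    # next symbol of the random walk
        guess = learner.predict(symbol)  # learner commits before seeing output
        state = delta[(state, symbol)]   # the machine moves; learner has no control
        output = label[state]
        mistakes += int(guess != output)
        learner.update(symbol, output)   # the true output is then revealed
    return mistakes


if __name__ == "__main__":
    rng = random.Random(0)
    alphabet = ["0", "1"]
    delta, label = make_random_dfa(8, alphabet, rng)
    learner = ContextPredictor(k=3)
    print(run_online(delta, label, alphabet, learner, 2000, rng))
```

The sketch only fixes the interface between the environment and the learner; any other predictor exposing `predict` and `update` could be substituted to compare mistake counts on the same walk.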
