Diversity-Based Inference of Finite Automata (Extended Abstract)

We present a new procedure for inferring the structure of a finitestate automaton (FSA) from its input/output behavior, using access to the automaton to perform experiments. Our procedure uses a new representation for FSA's, based on the notion of equivalence between testa. We call the number of such equivalence classes the diversity of the automaton; the diversity may be as small as the logarithm of the number of states of the automaton. The size of our representation of the FSA, and the running time of our procedure (in some case provably, in others conjecturally) is polynomial in the diversity and ln(1/e), where e is a given upper bound on the probability that our procedure returns an incorrect result. (Since our procedure uses randomization to perform experiments, there is a certain controllable chance that it will return an erroneous result.) We also present some evidence for the practical efficiency of our approach. For example, our procedure is able to infer the structure of an automaton based on Rubik's Cube (which has approximately 1019 states) in about 2 minutes on a DEC Micro Vax. This automaton is many orders of magnitude larger than possible with previous techniques, which would require time proportional at least to the number of global states. (Note that in this example, only a small fraction (10-14) of the global states were even visited.)