Incremental Construction and Maintenance of Minimal Finite-State Automata

Daciuk et al. [Computational Linguistics 26(1):316 (2000)] describe a method for constructing incrementally minimal, deterministic, acyclic finite-state automata (dictionaries) from sets of strings. But acyclic finite-state automata have limitations: For instance, if one wants a linguistic application to accept all possible integer numbers or Internet addresses, the corresponding finite-state automaton has to be cyclic. In this article, we describe a simple and equally efficient method for modifying any minimal finite-state automaton (be it acyclic or not) so that a string is added to or removed from the language it accepts; both operations are very important when dictionary maintenance is performed and solve the dictionary construction problem addressed by Daciuk et al. as a special case. The algorithms proposed here may be straightforwardly derived from the customary textbook constructions for the intersection and the complementation of finite-state automata; the algorithms exploit the special properties of the automata resulting from the intersection operation when one of the finite-state automata accepts a single string.