From Regular Expressions to DFA's Using Compressed NFA's

We show how to turn a regular expression R of length r into an O(s) space representation of McNaughton and Yamada's NFA, where s is the number of occurrences of alphabet symbols in R, and s+1 is the number of NFA states. The standard adjacency list representation of McNaughton and Yamada's NFA takes up s + s² space in the worst case. The adjacency list representation of the NFA produced by Thompson takes up between 2r and 6r space, where r can be arbitrarily larger than s. Given any set V of NFA states, our representation can be used to compute the set U of states one transition away from the states in V in optimal time O(|V|+|U|). With the adjacency list representation of McNaughton and Yamada's NFA, the same calculation requires Θ(|V| × |U|) time in the worst case. Using Thompson's NFA, the equivalent calculation requires Θ(r) time in the worst case.
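To make the baseline concrete, the following is a minimal sketch of the standard adjacency list form of McNaughton and Yamada's NFA, built from a small regular expression tree, together with the naive one-step computation of U from V. It is not the compressed representation described above. The tuple encoding of expressions and the names `position_automaton` and `step` are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical expression encoding (an assumption for this sketch):
#   ("sym", a), ("alt", e1, e2), ("cat", e1, e2), ("star", e)

def position_automaton(expr):
    """Build the adjacency-list McNaughton-Yamada NFA for expr.

    States are 0 (start) and 1..s, one per symbol occurrence.
    Returns (s, symbol_at, follow, finals).
    """
    symbol_at = {}             # state -> alphabet symbol labelling edges into it
    follow = defaultdict(set)  # state -> states one transition away

    def walk(node):
        # Returns (nullable, first, last) for the subexpression.
        kind = node[0]
        if kind == "sym":
            p = len(symbol_at) + 1
            symbol_at[p] = node[1]
            return False, {p}, {p}
        if kind == "star":
            _, first, last = walk(node[1])
            for p in last:
                follow[p] |= first
            return True, first, last
        n1, f1, l1 = walk(node[1])
        n2, f2, l2 = walk(node[2])
        if kind == "alt":
            return n1 or n2, f1 | f2, l1 | l2
        # Concatenation: every last position of e1 is followed by first of e2.
        for p in l1:
            follow[p] |= f2
        first = f1 | (f2 if n1 else set())
        last = l2 | (l1 if n2 else set())
        return n1 and n2, first, last

    nullable, first, last = walk(expr)
    follow[0] = set(first)
    finals = set(last) | ({0} if nullable else set())
    return len(symbol_at), symbol_at, follow, finals

def step(follow, V):
    """Naive computation of U: scan the full adjacency list of each state in V.

    Each list may contain up to |U| states, hence the |V| x |U| worst case.
    """
    U = set()
    for q in V:
        U |= follow.get(q, set())
    return U

# (a|b)*abb has s = 5 symbol occurrences, so the NFA has 6 states.
a, b = ("sym", "a"), ("sym", "b")
expr = ("cat", ("cat", ("cat", ("star", ("alt", a, b)), a), b), b)
s, symbol_at, follow, finals = position_automaton(expr)
edges = sum(len(targets) for targets in follow.values())
```

In this sketch, `step(follow, {1, 3})` unions two adjacency lists whose contents overlap, which is the repeated work that a representation with O(|V|+|U|) stepping avoids.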
