Obtaining shorter regular expressions from finite-state automata

We consider the use of state elimination to construct shorter regular expressions from finite-state automata (FAs). Although state elimination is an intuitive method for computing regular expressions from FAs, the resulting regular expressions are often very long and complicated. We first examine the minimization of FAs as a way to obtain shorter expressions. Then, we introduce vertical chopping based on bridge states and horizontal chopping based on the structural properties of given FAs. We prove that, to obtain shorter regular expressions, bridge states should not be eliminated until all non-bridge states have been eliminated. In addition, we suggest heuristics for state elimination, based on vertical chopping and horizontal chopping, that lead to shorter regular expressions.
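
To make the dependence on elimination order concrete, the following is a minimal Python sketch of textbook state elimination. It is an illustration, not the algorithm or heuristics developed in the paper. The function names (`union`, `wrap`, `star`, `eliminate`, `state_elimination`) and the four-state example FA are hypothetical. The sketch assumes single-character symbols, no empty-string labels, a start state with no incoming transitions, and a final state with no outgoing transitions; `+` denotes union.

```python
def union(r, s):
    """Regex for r + s; None stands for 'no transition'."""
    if r is None:
        return s
    if s is None or r == s:
        return r
    return f"{r}+{s}"


def wrap(r):
    """Parenthesize r if it contains a union (conservative, may over-wrap)."""
    return f"({r})" if "+" in r else r


def star(r):
    """Regex for r*; a missing self-loop contributes nothing."""
    if r is None:
        return ""
    return f"{r}*" if len(r) == 1 else f"({r})*"


def eliminate(edges, k):
    """Remove state k, rerouting every path p -> k -> q through a new label."""
    loop = star(edges.get((k, k)))
    ins = [(p, r) for (p, q), r in edges.items() if q == k and p != k]
    outs = [(q, r) for (p, q), r in edges.items() if p == k and q != k]
    rest = {pq: r for pq, r in edges.items() if k not in pq}
    for p, r_in in ins:
        for q, r_out in outs:
            bypass = wrap(r_in) + loop + wrap(r_out)
            rest[(p, q)] = union(rest.get((p, q)), bypass)
    return rest


def state_elimination(edges, start, final, order):
    """Eliminate the states in `order`; return the label left on start -> final."""
    for k in order:
        edges = eliminate(edges, k)
    return edges.get((start, final))


# Hypothetical FA: 0 -a-> 1, 1 -b-> 2, 2 -c-> 1, 1 -d-> 3 (start 0, final 3).
edges = {(0, 1): "a", (1, 2): "b", (2, 1): "c", (1, 3): "d"}

short = state_elimination(edges, 0, 3, order=[2, 1])
long_ = state_elimination(edges, 0, 3, order=[1, 2])
print(short)  # a(bc)*d
print(long_)  # ad+ab(cb)*cd
```

Both expressions describe the same language, but eliminating state 2 first yields 7 characters while eliminating state 1 first yields 12. On larger FAs the gap between elimination orders can grow much wider, which is why the order is worth choosing carefully.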
