Inferring strings from suffix trees and links on a binary alphabet

The suffix tree, a linear-space full-text index of a given string, is a fundamental data structure for string processing and information retrieval. In this paper we consider the reverse engineering problem on suffix trees: given an unlabeled ordered rooted tree T accompanied by a node-to-node transition function f, infer a string whose suffix tree is isomorphic to T and whose suffix links on inner nodes are isomorphic to f. We also consider the enumeration problem, in which we enumerate all strings corresponding to an input tree and links. By introducing new characterizations of suffix trees, we show that both the reverse engineering problem and the enumeration problem on suffix trees over a binary alphabet can be solved in optimal time.
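To make the problem statement concrete, the following is a minimal brute-force sketch in Python; it is not the optimal-time algorithm of the paper. It rests on several assumptions that the abstract does not state: the string is terminated by a unique end marker `$`, children are ordered by the first character of their edge label with `$` < `a` < `b`, and nodes are identified by their preorder rank. The names `suffix_tree_signature` and `strings_matching` are hypothetical helpers introduced only for illustration.

```python
from itertools import product


def suffix_tree_signature(w, alphabet="ab"):
    """Return (tree, links) for the suffix tree of w + "$".

    tree  : the unlabeled ordered shape, as nested tuples of children
    links : sorted (source, target) pairs of preorder ranks, one pair
            per non-root inner node (its suffix link)
    Assumption: children are ordered "$" first, then alphabet order.
    """
    s = w + "$"
    n = len(s)
    # Naive: collect every substring (fine for tiny inputs only).
    subs = {s[i:j] for i in range(n) for j in range(i, n + 1)}

    def branches(x):
        # Characters that can follow x somewhere in s, in child order.
        return [c for c in "$" + alphabet if x + c in subs]

    # Explicit nodes: root, leaves (suffixes), and branching substrings.
    nodes = {""} | {s[i:] for i in range(n)}
    nodes |= {x for x in subs if len(branches(x)) >= 2}

    order = {}  # path label -> preorder rank

    def shape(x):
        order[x] = len(order)
        kids = []
        for c in branches(x):
            y = x + c
            while y not in nodes:  # walk down a unary path
                y += branches(y)[0]
            kids.append(shape(y))
        return tuple(kids)

    tree = shape("")
    # Suffix link of inner node x points to the node labeled x[1:].
    links = tuple(sorted(
        (order[x], order[x[1:]]) for x in nodes if x and branches(x)
    ))
    return tree, links


def strings_matching(tree, links, n, alphabet="ab"):
    """Enumerate all length-n strings whose signature equals (tree, links)."""
    return [
        "".join(p)
        for p in product(alphabet, repeat=n)
        if suffix_tree_signature("".join(p), alphabet) == (tree, links)
    ]
```

Under these assumptions, calling `strings_matching(*suffix_tree_signature("aab"), 3)` enumerates every binary string of length 3 that shares its tree shape and suffix links with `aab`. The search tries all 2^n candidates and materializes every substring, so it only illustrates the input and output of the two problems, not an efficient solution.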
