Sparse Suffix Trees

A sparse suffix tree is a suffix tree that represents only a subset of the suffixes of the text, in contrast to the standard suffix tree, which represents all of them. By selecting a small enough subset, a sparse suffix tree can be made to fit the available storage, at the cost of increased search times. The idea of sparse suffix trees goes back to PATRICIA tries. Evenly spaced sparse suffix trees represent every kth suffix of the text. In this paper, we give general construction and search algorithms for evenly spaced sparse suffix trees and analyze their running time in both the worst and the average case. The algorithms are further improved by using so-called dual suffix trees.
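
To make the space/time trade-off concrete, the following is a minimal sketch of searching an evenly spaced sparse index. It is not the construction or search algorithm of the paper: for brevity it swaps the tree for a sorted list of the sampled suffix positions (a sparse suffix array), and the function names `build_sparse_index` and `find_all` are hypothetical. It also assumes the pattern is at least k characters long, so that every occurrence covers at least one sampled position. Because only every kth suffix is indexed, the pattern is split at each of the k possible alignments, the tail is looked up in the index, and the head is verified against the text.

```python
def build_sparse_index(text, k):
    """Return the start positions 0, k, 2k, ... sorted by their suffixes.

    Naive construction for illustration only (sorting by slices).
    """
    return sorted(range(0, len(text), k), key=lambda i: text[i:])


def _prefix_range(text, index, pattern):
    """Return sampled positions whose suffix starts with `pattern`."""
    m = len(pattern)
    # First entry whose m-character prefix is >= pattern.
    lo, hi = 0, len(index)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[index[mid]:index[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # First entry whose m-character prefix is > pattern.
    hi = len(index)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[index[mid]:index[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return index[start:lo]


def find_all(text, index, k, pattern):
    """Return all start positions of `pattern` in `text`, in sorted order.

    Assumption of this sketch: len(pattern) >= k, so each occurrence
    overlaps a sampled position at some offset j in 0..k-1.
    """
    if len(pattern) < k:
        raise ValueError("pattern must be at least k characters long")
    hits = set()
    for j in range(k):
        head, tail = pattern[:j], pattern[j:]
        for pos in _prefix_range(text, index, tail):
            start = pos - j
            # Verify the j characters that precede the sampled position.
            if start >= 0 and text[start:pos] == head:
                hits.add(start)
    return sorted(hits)


text = "abracadabra"
index = build_sparse_index(text, 3)
print(find_all(text, index, 3, "abra"))
```

The index stores roughly n/k positions instead of n, while a query performs up to k lookups plus verification, which mirrors the trade-off described in the abstract.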
