Most Recent Match Queries in On-Line Suffix Trees

A suffix tree can efficiently locate a pattern in an indexed string, but it does not in general locate the most recent copy of the pattern in an online stream, which some applications require. We study the most general version of the most-recent-match problem: supporting queries for arbitrary patterns at each step of processing an online stream. We present augmentations to Ukkonen's suffix tree construction algorithm that support optimal-time queries while keeping indexing time within a logarithmic factor in the size of the indexed string. We show that the algorithm is applicable to sliding-window indexing, and sketch a possible optimization for the special case of Lempel-Ziv compression.
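To make the query semantics concrete, the following is a minimal brute-force sketch of a most-recent-match query over an online stream. It is not the suffix tree augmentation described above: it rescans the whole stream on every query and serves only as a reference for what a correct answer looks like. The class name `NaiveRecentMatchIndex` and its methods are hypothetical names chosen for illustration.

```python
class NaiveRecentMatchIndex:
    """Brute-force reference for most-recent-match queries.

    Assumption: this is an illustrative baseline only, not a suffix tree.
    Each query scans the stream seen so far.
    """

    def __init__(self) -> None:
        self._symbols: list[str] = []  # stream symbols received so far

    def append(self, symbol: str) -> None:
        """Process one more symbol of the online stream."""
        self._symbols.append(symbol)

    def most_recent_match(self, pattern: str) -> int:
        """Return the start position of the rightmost occurrence of
        pattern in the stream so far, or -1 if it does not occur."""
        return "".join(self._symbols).rfind(pattern)


index = NaiveRecentMatchIndex()
for ch in "abcabcab":
    index.append(ch)

print(index.most_recent_match("abc"))  # 3: the copy starting at position 3
print(index.most_recent_match("x"))    # -1: no occurrence yet

index.append("c")                      # stream is now "abcabcabc"
print(index.most_recent_match("abc"))  # 6: a newer copy has appeared
```

The answer to the same query changes as the stream grows, which is why an index has to track recency at every step rather than only record where a pattern occurs.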
