Computing All Distinct Squares in Linear Time for Integer Alphabets

Given a string over an integer alphabet, we present an algorithm that computes the set of all distinct squares occurring in this string in time linear in the string length. As an application, we show how to compute the tree topology of the minimal augmented suffix tree in linear time. In addition, we describe an algorithm that computes the longest previous factor table in a succinct representation using compressed working space.
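To make the problem statement concrete, the following is a minimal brute-force sketch of what "the set of all distinct squares" means, where a square is a substring of the form uu for a non-empty u. It is not the linear-time algorithm of the paper; the function name `distinct_squares` and the tuple-based output are assumptions chosen for illustration, and the running time is far from linear.

```python
def distinct_squares(text):
    """Return the set of distinct squares in `text` by brute force.

    `text` may be any sequence over an integer alphabet (or a str).
    Each square is reported once, as a tuple of symbols.
    Illustrative reference only; NOT the linear-time method of the paper.
    """
    seq = tuple(text)
    n = len(seq)
    squares = set()
    for start in range(n):
        # `half` is the length of u in a candidate square uu starting at `start`.
        for half in range(1, (n - start) // 2 + 1):
            left = seq[start:start + half]
            right = seq[start + half:start + 2 * half]
            if left == right:
                squares.add(left + right)
    return squares


if __name__ == "__main__":
    # The squares (1, 1) and (1, 1, 2, 1, 1, 2) are each listed once,
    # although (1, 1) occurs twice in the input.
    print(sorted(distinct_squares([1, 1, 2, 1, 1, 2])))
```

A brute-force version like this can serve as a correctness check on small inputs for a faster implementation.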
