Extracting Powers and Periods in a String from Its Runs Structure

A breakthrough in text algorithms was the discovery that a string of length n contains O(n) runs and that all of them can be computed in O(n) time. We study some applications of this result. We present new, simpler O(n)-time algorithms for several classical string problems: computing all distinct kth powers in a string for a given k (in particular squares, for k = 2) and finding all local periods in a given string of length n. Additionally, we present an efficient algorithm for testing primitivity of factors of a string and computing their primitive roots. Despite their importance, applications of runs are underrepresented in the existing literature (approximately one page in the paper of Kolpakov & Kucherov, 1999). In this paper we attempt to fill this gap. We use Lyndon words and introduce the Lyndon structure of runs as a useful tool for computing powers. For problems related to periods, we use variants of the Manhattan skyline problem.
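
To make the objects in the abstract concrete, the following Python sketch computes runs, distinct squares, and primitive roots by brute force. It is not the O(n)-time method of the paper: the function names (naive_runs, distinct_squares, primitive_root) are hypothetical, the run computation is quadratic, and a run is assumed to be a triple (start, end, period) with an inclusive end, where the period is the smallest period of the factor, the factor is at least twice as long as that period, and the factor cannot be extended in either direction without breaking the period.

```python
def naive_runs(s):
    """Return sorted (start, end, period) triples; end is inclusive.

    Brute-force illustration only (quadratic time), not a linear-time method.
    """
    n = len(s)
    found = {}
    # Periods are tried in increasing order, so the first period recorded
    # for an interval is the smallest period of that factor.
    for p in range(1, n // 2 + 1):
        k = 0
        while k + p < n:
            if s[k] != s[k + p]:
                k += 1
                continue
            a = k
            # Extend the stretch of positions k with s[k] == s[k + p].
            while k + p < n and s[k] == s[k + p]:
                k += 1
            # At least p matches means the factor s[a .. k+p-1] has
            # period p and length at least 2p.
            if k - a >= p:
                found.setdefault((a, k + p - 1), p)
    return sorted((i, j, p) for (i, j), p in found.items())


def distinct_squares(s):
    """Collect the distinct squares of s by reading them off the runs."""
    squares = set()
    for i, j, p in naive_runs(s):
        length = j - i + 1
        m = 1
        while 2 * m * p <= length:
            # For a fixed root length m*p, start positions that differ by p
            # give the same factor, so at most p starts are needed.
            starts = min(p, length - 2 * m * p + 1)
            for a in range(i, i + starts):
                squares.add(s[a:a + 2 * m * p])
            m += 1
    return squares


def primitive_root(w):
    """Return the primitive root of a non-empty string w."""
    # The first occurrence of w in ww after position 0 starts at the
    # length of the primitive root of w.
    p = (w + w).find(w, 1)
    return w[:p]


def is_primitive(w):
    return primitive_root(w) == w
```

For example, naive_runs("aabaab") returns [(0, 1, 1), (0, 5, 3), (3, 4, 1)], distinct_squares("aabaab") returns the set {"aa", "aabaab"}, and primitive_root("abab") returns "ab". Every square has a primitive-rooted period and therefore lies inside some run, which is why enumerating runs suffices to find all squares in this sketch.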

[1] M. Crochemore et al. Algorithms on Strings: Tools, 2007.

[2] Lucian Ilie et al. A note on the number of squares in a word, 2007, Theor. Comput. Sci.

[3] Bernard Chazelle et al. A Functional Approach to Data Structures and Its Use in Multidimensional Searching, 1988, SIAM J. Comput.

[4] Jens Stoye et al. Linear time algorithms for finding and representing all the tandem repeats in a string, 2004, J. Comput. Syst. Sci.

[5] Wojciech Rytter et al. On the Maximal Number of Cubic Subwords in a String, 2009, IWOCA.

[6] Gang Chen et al. Fast and Practical Algorithms for Computing All the Runs in a String, 2007, CPM.

[7] Maxime Crochemore et al. Algorithms on strings, 2007.

[8] Lucian Ilie et al. A simple proof that a word of length n has at most 2n distinct squares, 2005, J. Comb. Theory A.

[9] Arnaud Lefebvre et al. Linear-Time Computation of Local Periods, 2003, MFCS.

[10] Wojciech Rytter et al. Repetitions in strings: Algorithms and combinatorics, 2009, Theor. Comput. Sci.

[11] Kunihiko Sadakane et al. Succinct data structures for flexible text retrieval systems, 2007, J. Discrete Algorithms.

[12] Gregory Kucherov et al. On Maximal Repetitions in Words, 1999, FCT.

[13] Aviezri S. Fraenkel et al. How Many Squares Can a String Contain?, 1998, J. Comb. Theory, Ser. A.

[14] Volker Heun et al. A New Succinct Representation of RMQ-Information and Improvements in the Enhanced Suffix Array, 2007, ESCAPE.

[15] Wojciech Rytter et al. Jewels of stringology, 2002.

[16] Robert E. Tarjan et al. A linear-time algorithm for a special case of disjoint set union, 1983, J. Comput. Syst. Sci.