Computing Smallest and Largest Repetition Factorizations in O(n log n) Time

A factorization f1, ..., fm of a string w is called a repetition factorization of w if each factor fi is a repetition, namely, fi = x^k x' for some non-empty string x, an integer k ≥ 2, and x' being a proper prefix of x. Dumitran et al. (Proc. SPIRE 2015) proposed an algorithm which computes a repetition factorization of a given string w in O(n) time, where n is the length of w. In this paper, we propose two algorithms which compute smallest/largest repetition factorizations in O(n log n) time. The first algorithm is a simple O(n log n) space algorithm while the second one uses only O(n) space.
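To make the definition concrete, the following is a minimal sketch of a naive quadratic-time dynamic program that returns the sizes of a smallest and a largest repetition factorization. It is not one of the O(n log n) algorithms proposed in the paper; it is only a brute-force baseline for illustration. It relies on the observation that a string is a repetition in the above sense exactly when its smallest period is at most half its length. The function names `prefix_function` and `repetition_factorization_sizes` are hypothetical and chosen for this example.

```python
def prefix_function(s):
    """KMP failure function: pi[j] is the length of the longest proper border of s[:j+1]."""
    pi = [0] * len(s)
    k = 0
    for j in range(1, len(s)):
        while k > 0 and s[j] != s[k]:
            k = pi[k - 1]
        if s[j] == s[k]:
            k += 1
        pi[j] = k
    return pi


def repetition_factorization_sizes(w):
    """Return (smallest, largest) number of factors over all repetition
    factorizations of w, or None if w has no repetition factorization.

    Naive O(n^2) baseline (assumption: illustration only, not the paper's method).
    """
    n = len(w)
    lo = [None] * (n + 1)  # lo[j]: fewest factors covering w[:j]
    hi = [None] * (n + 1)  # hi[j]: most factors covering w[:j]
    lo[0] = hi[0] = 0
    for i in range(n):
        if lo[i] is None:
            continue  # the prefix w[:i] cannot be factorized into repetitions
        pi = prefix_function(w[i:])
        for length in range(2, n - i + 1):
            period = length - pi[length - 1]  # smallest period of w[i:i+length]
            if 2 * period <= length:  # w[i:i+length] = x^k x' with k >= 2
                j = i + length
                if lo[j] is None or lo[i] + 1 < lo[j]:
                    lo[j] = lo[i] + 1
                if hi[j] is None or hi[i] + 1 > hi[j]:
                    hi[j] = hi[i] + 1
    if lo[n] is None:
        return None
    return lo[n], hi[n]
```

For example, `aaaa` can be taken as a single repetition or split as `aa`, `aa`, whereas `abc` contains no repetition at all and therefore has no repetition factorization.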
