On repetitiveness measures of Thue-Morse words

We show that the size $\gamma(t_n)$ of the smallest string attractor of the $n$th Thue-Morse word $t_n$ is 4 for any $n\geq 4$, disproving the conjecture by Mantaci et al. [ICTCS 2019] that it is $n$. We also show that $\delta(t_n) = \frac{10}{3+2^{4-n}}$ for $n \geq 3$, where $\delta(w)$ is the maximum, over all $k = 1,\ldots,|w|$, of the number of distinct substrings of length $k$ in $w$ divided by $k$; this measure of repetitiveness was recently studied by Kociumaka et al. [LATIN 2020]. Furthermore, we show that the number $z(t_n)$ of factors in the self-referencing Lempel-Ziv factorization of $t_n$ is exactly $2n$.
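The three quantities can be checked by brute force on small instances. The following is a minimal Python sketch for experimentation, not the method used to prove the results. It rests on several assumptions that the text above does not spell out: $t_0 = a$ and $t_n$ is $t_{n-1}$ followed by its letter-wise complement (so $|t_n| = 2^n$); the self-referencing factorization is the greedy one in which each factor is either a fresh letter or the longest prefix of the remaining text that also occurs starting at an earlier position, possibly overlapping the factor itself; and a string attractor is a set of positions such that every distinct substring has an occurrence covering at least one of them. All function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def thue_morse(n):
    """Return t_n over {a, b}, assuming t_0 = 'a' and t_n = t_{n-1} + complement(t_{n-1})."""
    swap = str.maketrans("ab", "ba")
    w = "a"
    for _ in range(n):
        w += w.translate(swap)
    return w


def delta(w):
    """Maximum over k of (number of distinct length-k substrings of w) / k, as an exact fraction."""
    return max(
        Fraction(len({w[i:i + k] for i in range(len(w) - k + 1)}), k)
        for k in range(1, len(w) + 1)
    )


def lz_self_ref(w):
    """Greedy LZ factorization in which a factor's earlier occurrence may overlap the factor."""
    factors, i = [], 0
    while i < len(w):
        length = 0
        # Extend while w[i:i+length+1] also occurs starting at some position < i.
        # Searching inside w[0:i+length] forces the occurrence to start before i.
        while i + length < len(w) and w.find(w[i:i + length + 1], 0, i + length) != -1:
            length += 1
        length = max(length, 1)  # no earlier occurrence: emit a single fresh letter
        factors.append(w[i:i + length])
        i += length
    return factors


def is_attractor(w, positions):
    """True if every distinct substring of w has an occurrence covering a 0-based position in `positions`."""
    n = len(w)
    for k in range(1, n + 1):
        covered = {
            w[i:i + k]
            for i in range(n - k + 1)
            if any(i <= p < i + k for p in positions)
        }
        if any(w[i:i + k] not in covered for i in range(n - k + 1)):
            return False
    return True


def smallest_attractor_size(w):
    """Exhaustive search over position sets; exponential, so only usable for short words."""
    for size in range(1, len(w) + 1):
        if any(is_attractor(w, c) for c in combinations(range(len(w)), size)):
            return size


if __name__ == "__main__":
    n = 4
    t = thue_morse(n)
    print("delta:", delta(t), "formula:", Fraction(10) / (3 + Fraction(16, 2 ** n)))
    print("z:", len(lz_self_ref(t)), "formula:", 2 * n)
    print("gamma:", smallest_attractor_size(t))
```

The exhaustive attractor search is only practical for very short words such as $t_4$; `delta` and `lz_self_ref` remain usable for somewhat larger $n$, and exact fractions avoid floating-point comparison when checking against $\frac{10}{3+2^{4-n}}$.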

[1] Juha Kärkkäinen et al. On the Size of Lempel-Ziv and Lyndon Factorizations, 2017, STACS.

[2] Dominik Kempa et al. At the roots of dictionary compression: string attractors, 2017, STOC.

[3] Abhi Shelat et al. The smallest grammar problem, 2005, IEEE Transactions on Information Theory.

[4] D. J. Wheeler et al. A Block-sorting Lossless Data Compression Algorithm, 1994.

[5] Andrea Frosini et al. Burrows-Wheeler Transform of Words Defined by Morphisms, 2019, IWOCA.

[6] Hideo Bannai et al. Faster Lyndon Factorization Algorithms for SLP and LZ78 Compressed Text, 2013, SPIRE.

[7] James A. Storer et al. Data compression via textual substitution, 1982, JACM.

[8] Ronitt Rubinfeld et al. Sublinear Algorithms for Approximating String Compressibility, 2007, Algorithmica.

[9] Srecko Brlek et al. Enumeration of factors in the Thue-Morse word, 1989, Discret. Appl. Math.

[10] Hideo Bannai et al. On the Size of Overlapping Lempel-Ziv and Lyndon Factorizations, 2019, CPM.

[11] R. Lyndon et al. Free Differential Calculus, IV. The Quotient Groups of the Lower Central Series, 1958.

[12] Jean Berstel et al. Crochemore Factorization of Sturmian and Other Infinite Words, 2006, MFCS.

[13] Guy Melançon et al. Lyndon factorization of the Thue-Morse word and its relatives, 1997, Discret. Math. Theor. Comput. Sci.

[14] Antonio Restivo et al. String Attractors and Combinatorics on Words, 2019, ICTCS.

[15] Antonio Restivo et al. Burrows-Wheeler transform and Sturmian words, 2003, Inf. Process. Lett.

[16] Harold Marston Morse. Recurrent geodesics on a surface of negative curvature, 1921.

[17] Tomasz Kociumaka et al. Towards a Definitive Measure of Repetitiveness, 2020, LATIN.

[18] Gonzalo Navarro et al. Optimal-Time Dictionary-Compressed Indexes, 2018, ACM Trans. Algorithms.

[19] Abraham Lempel et al. On the Complexity of Finite Sequences, 1976, IEEE Trans. Inf. Theory.