Burrows-Wheeler Transform of Words Defined by Morphisms

The Burrows-Wheeler transform (BWT) is a popular method used for text compression. It has been proved that the BWT has optimal performance on standard words, i.e. the building blocks of Sturmian words. In this paper, we study the application of the BWT to more general morphic words: the Thue-Morse word and generalizations of the Fibonacci word to alphabets with more than two letters. We then study morphisms obtained by composing the Thue-Morse morphism with a Sturmian one. In all these cases, the BWT efficiently clusters the iterates of the morphisms generating prefixes of these infinite words, for which we determine the compression clustering ratio.
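The clustering effect described above can be observed on small iterates with a minimal sketch. It is not the paper's method: it assumes the BWT is taken on cyclic rotations without an end marker, and it assumes the clustering ratio is the number of runs in the BWT divided by the number of runs in the input word. The names `bwt`, `iterate`, `runs`, and `clustering_ratio` are illustrative, and the morphisms are the usual Thue-Morse and Fibonacci ones on the alphabet {a, b}.

```python
def bwt(word):
    """Last column of the lexicographically sorted cyclic rotations of `word`.

    Assumption: no end-of-string marker is used, so only the transformed
    word is returned (not the index needed for inversion).
    """
    rotations = sorted(word[i:] + word[:i] for i in range(len(word)))
    return "".join(rotation[-1] for rotation in rotations)


def iterate(morphism, letter, n):
    """Apply `morphism` (a dict letter -> image) n times, starting from `letter`."""
    word = letter
    for _ in range(n):
        word = "".join(morphism[c] for c in word)
    return word


def runs(word):
    """Number of maximal blocks of equal consecutive letters in `word`."""
    return sum(1 for i, c in enumerate(word) if i == 0 or c != word[i - 1])


def clustering_ratio(word):
    """Assumed definition: runs of bwt(word) divided by runs of word."""
    return runs(bwt(word)) / runs(word)


# Standard presentations of the two morphisms (an assumption of this sketch).
THUE_MORSE = {"a": "ab", "b": "ba"}
FIBONACCI = {"a": "ab", "b": "a"}

if __name__ == "__main__":
    for name, morphism in (("Thue-Morse", THUE_MORSE), ("Fibonacci", FIBONACCI)):
        for n in range(2, 8):
            w = iterate(morphism, "a", n)
            print(name, n, len(w), runs(bwt(w)), round(clustering_ratio(w), 3))
```

Sorting all rotations explicitly is quadratic in memory and only suitable for short iterates; a suffix-array construction would be the natural replacement for longer words.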
