Words and repeated factors.

In this paper we consider sets of factors of a finite word which allow the entire word to be reconstructed. The analysis is based on the notion of box. The initial (resp. terminal) box of w is the shortest prefix (resp. suffix) of w which is an unrepeated factor. A factor u of w is a proper box if there are letters a, a′, b, b′ with a′ ≠ a and b′ ≠ b such that u = asb and both a′s and sb′ are factors of w. A box is called maximal if it is not a proper factor of another box. The main result of the paper is the maximal box theorem: any finite word w is uniquely determined by its initial box, its terminal box and its set of maximal boxes. Another important combinatorial notion is that of superbox. A superbox is any factor of w of the form asb, with a, b letters, such that s is a repeated factor, whereas as and sb are unrepeated factors. A theorem for superboxes similar to the maximal box theorem is proved. Some algorithms for constructing boxes and superboxes and, conversely, for reconstructing the word are given. An extension of these results to languages is also presented.
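
The definitions above can be illustrated with a brute-force sketch in Python. This is not the algorithm given in the paper, only a direct transcription of the definitions, and it rests on a few assumptions that the abstract does not settle: the word is non-empty, the middle part s may be empty, occurrences may overlap, and the initial and terminal boxes are counted as boxes when testing maximality. All function names are hypothetical.

```python
def factors(w):
    """All non-empty factors (substrings) of w."""
    return {w[i:j] for i in range(len(w)) for j in range(i + 1, len(w) + 1)}


def is_repeated(w, u):
    """True if u occurs at least twice in w (overlapping occurrences count)."""
    hits = sum(1 for i in range(len(w) - len(u) + 1) if w.startswith(u, i))
    return hits >= 2


def initial_box(w):
    """Shortest prefix of w that is an unrepeated factor (w assumed non-empty)."""
    for k in range(1, len(w) + 1):
        if not is_repeated(w, w[:k]):
            return w[:k]


def terminal_box(w):
    """Shortest suffix of w that is an unrepeated factor (w assumed non-empty)."""
    for k in range(1, len(w) + 1):
        if not is_repeated(w, w[-k:]):
            return w[-k:]


def proper_boxes(w):
    """Factors u = a s b such that a's and s b' are factors for some a' != a, b' != b."""
    facs = factors(w)
    alphabet = set(w)
    boxes = set()
    for u in facs:
        if len(u) < 2:
            continue
        a, s, b = u[0], u[1:-1], u[-1]  # s may be empty (assumption)
        left = any(c != a and c + s in facs for c in alphabet)
        right = any(c != b and s + c in facs for c in alphabet)
        if left and right:
            boxes.add(u)
    return boxes


def maximal_boxes(w):
    """Boxes that are not a proper factor of another box.

    Assumption: the initial and terminal boxes count as boxes here.
    """
    boxes = proper_boxes(w) | {initial_box(w), terminal_box(w)}
    return {u for u in boxes if not any(u != v and u in v for v in boxes)}


def superboxes(w):
    """Factors a s b with s repeated while a s and s b are unrepeated."""
    result = set()
    for u in factors(w):
        if len(u) < 2:
            continue
        s = u[1:-1]
        if (is_repeated(w, s)
                and not is_repeated(w, u[:-1])
                and not is_repeated(w, u[1:])):
            result.add(u)
    return result


if __name__ == "__main__":
    w = "abaab"
    print(initial_box(w), terminal_box(w))
    print(sorted(maximal_boxes(w)))
    print(sorted(superboxes(w)))
```

For w = abaab, under the assumptions stated, this sketch gives initial box aba, terminal box aab, maximal boxes aba, baa and aab, and the single superbox baa. The enumeration is quadratic in the number of factors and is meant only to make the definitions concrete.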
