On the Combinatorics of Finite Words

In this paper we consider a combinatorial method for the analysis of finite words, recently introduced in Colosimo and de Luca (Special factors in biological strings, preprint 97/42, Dipt. Matematica, Univ. di Roma) for the study of biological macromolecules. The method is based on the analysis of the (right) special factors of a given word. A factor u of a word w is special if there exist at least two occurrences of u in w followed on the right by two distinct letters. We show that two parameters play an essential role in the combinatorics of finite words. The first, denoted by R, is the minimal integer such that w has no special factor of length R. The second, denoted by K, is the minimal length of a factor of w that cannot be extended on the right to a factor of w. Some new results are proved. In particular, a new characterization, in terms of special factors and of R and K, is given for the set PER of all words w having two coprime periods p and q such that |w| = p + q − 2.
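
The definitions of special factor, R and K above can be made concrete with a small brute-force sketch. It is only an illustration of the definitions as stated in the abstract, not the method of the paper; the function names (right_extensions, is_right_special, min_R, min_K) are hypothetical and chosen here for readability.

```python
def factors(w, n):
    """Set of all factors (contiguous substrings) of w of length n."""
    return {w[i:i + n] for i in range(len(w) - n + 1)}


def right_extensions(w, u):
    """Set of letters x such that u + x is a factor of w."""
    n = len(u)
    return {w[i + n] for i in range(len(w) - n) if w[i:i + n] == u}


def is_right_special(w, u):
    """u is (right) special in w if it is followed by at least two distinct letters."""
    return len(right_extensions(w, u)) >= 2


def min_R(w):
    """Smallest n such that w has no special factor of length n."""
    for n in range(len(w) + 1):
        if not any(is_right_special(w, u) for u in factors(w, n)):
            return n


def min_K(w):
    """Smallest length of a factor of w that has no right extension in w."""
    for n in range(len(w) + 1):
        if any(not right_extensions(w, u) for u in factors(w, n)):
            return n


w = "abaab"
print(is_right_special(w, "a"))  # "a" is followed by both "a" and "b" in w
print(min_R(w), min_K(w))
```

For the word abaab, the factor a is special (it is followed by both a and b), no factor of length 2 is special, and the shortest factor with no right extension is the suffix aab, so the sketch gives R = 2 and K = 3. Both loops always terminate with a value, because w itself is a factor of length |w| that has no right extension and is therefore neither extendable nor special.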
