Periodicity and repetitions in parameterized strings

One of the most beautiful and useful notions in the mathematical theory of strings is that of a period, i.e., an initial piece of a given string that can generate that string by repeating itself at regular intervals. Periods have an elegant mathematical structure and a wealth of applications [F. Mignosi and A. Restivo, Periodicity, Algebraic Combinatorics on Words, in: M. Lothaire (Ed.), Cambridge University Press, Cambridge, pp. 237-274, 2002]. At the heart of their theory are two Periodicity Lemmas: one due to Lyndon and Schützenberger [The equation a^M=b^Nc^P in a free group, Michigan Math. J. 9 (1962) 289-298], referred to as the Weak Version, and the other due to Fine and Wilf [Uniqueness theorems for periodic functions, Proc. Amer. Math. Soc. 16 (1965) 109-114]. In this paper, we investigate the notion of periodicity and the closely related one of repetition in connection with parameterized strings as introduced by Baker [Parameterized pattern matching: algorithms and applications, J. Comput. System Sci. 52(1) (1996) 28-42; Parameterized duplication in strings: algorithms and an application to software maintenance, SIAM J. Comput. 26(5) (1997) 1343-1362]. In such strings, the notion of pairwise match or "equivalence" of symbols is more relaxed than the usual one, in that it rests on some mapping, rather than identity, of symbols. It seems natural to try to extend the notions of period and periodicity to encompass parameterized strings. However, we know of no previous attempt in this direction. Our preliminary investigation yields the following results. For periodicity, we get (a) a generalization of the Weak Version of the Periodicity Lemma to parameterized strings, showing that it is essential that the two mappings inducing the periodicity commute; (b) a proof that an analog of the Lemma by Fine and Wilf [Uniqueness theorems for periodic functions, Proc. Amer. Math. Soc. 16 (1965) 109-114] cannot hold for parameterized strings, even if the mappings inducing the periodicity "commute", in a sense to be specified below; (c) a proof that a parameterized string over an alphabet of at least three letters may have a set of periods that differs from that of any binary string of the same length, whereby the parameterized analog of a classic result by Guibas and Odlyzko [String overlaps, pattern matching, and nontransitive games, J. Combin. Theory Ser. A 30 (1981) 183-208] cannot hold. We also derive necessary and sufficient conditions characterizing parameterized repetitions, which are patterns of length at least twice that of the period, show how the notion of root differs from the standard case, and highlight some of the implications for extending algorithmic criteria previously adopted for string searching, detection of repetitions, and the like. Finally, as a corollary of our main results, we show that binary parameterized strings behave much in the same way as non-parameterized ones with respect to periodicity and repetitions, while there is a substantial difference for strings over alphabets of at least three symbols.
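To make the relaxed notion of match concrete, the following minimal Python sketch contrasts ordinary periods with parameterized ones. It rests on assumptions not spelled out in the abstract: every symbol is treated as a parameter, two equal-length strings are taken to match when some one-to-one renaming of symbols maps one onto the other, and p is taken to be a parameterized period of x when the prefix of length |x|-p matches the suffix of length |x|-p in that sense. The function names are hypothetical, and the paper's formal definitions may differ in detail.

```python
def p_match(u: str, v: str) -> bool:
    """Return True if some one-to-one renaming of symbols turns u into v.

    Assumption: all symbols are parameters (no fixed symbols).
    """
    if len(u) != len(v):
        return False
    forward, backward = {}, {}
    for a, b in zip(u, v):
        # The renaming must be a function (forward) and injective (backward).
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


def exact_periods(x: str) -> list[int]:
    """Periods in the usual sense: x[i] == x[i + p] wherever both are defined."""
    n = len(x)
    return [p for p in range(1, n + 1) if x[: n - p] == x[p:]]


def parameterized_periods(x: str) -> list[int]:
    """Periods under the assumed definition: prefix and suffix match up to renaming."""
    n = len(x)
    return [p for p in range(1, n + 1) if p_match(x[: n - p], x[p:])]


if __name__ == "__main__":
    for s in ("abaab", "abcabc"):
        print(s, exact_periods(s), parameterized_periods(s))
```

Every ordinary period is also a parameterized period under this definition, since the identity renaming is always allowed; the converse fails, which is why the two sets printed for each string can differ.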

[1]  Filippo Mignosi, et al. Generalizations of the Periodicity Theorem of Fine and Wilf, 1994, CAAP.

[2]  Brenda S. Baker. Parameterized Pattern Matching: Algorithms and Applications, 1996, J. Comput. Syst. Sci.

[3]  Brenda S. Baker, et al. Parameterized Duplication in Strings: Algorithms and an Application to Software Maintenance, 1997, SIAM J. Comput.

[4]  M. Schützenberger, et al. The equation $a^M=b^Nc^P$ in a free group, 1962.

[5]  Franco P. Preparata, et al. Structural Properties of the String Statistics Problem, 1985, J. Comput. Syst. Sci.

[6]  Z. Galil, et al. Pattern matching algorithms, 1997.

[7]  Moshe Lewenstein, et al. Function Matching: Algorithms, Applications, and a Lower Bound, 2003, ICALP.

[8]  Franco P. Preparata, et al. Optimal Off-Line Detection of Repetitions in a String, 1983, Theor. Comput. Sci.

[9]  H. Wilf, et al. Uniqueness theorems for periodic functions, 1965.

[10]  Dan Gusfield, et al. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology, 1997.

[11]  N. Rhee, et al. A Characterization of the Centralizer of a Permutation, 1999.

[12]  Raffaele Giancarlo, et al. Sparse Dynamic Programming for Longest Common Subsequence from Fragments, 2002, J. Algorithms.

[13]  Brenda S. Baker. Parameterized diff, 1999, SODA '99.

[14]  Leonidas J. Guibas, et al. String Overlaps, Pattern Matching, and Nontransitive Games, 1981, J. Comb. Theory A.

[15]  S. Muthukrishnan, et al. Alphabet Dependence in Parameterized Matching, 1994, Inf. Process. Lett.

[16]  Zvi Galil. On improving the worst case running time of the Boyer-Moore string matching algorithm, 1979, CACM.

[17]  Raffaele Giancarlo, et al. The Boyer-Moore-Galil String Searching Strategies Revisited, 1986, SIAM J. Comput.

[18]  Franco P. Preparata, et al. Data structures and algorithms for the string statistics problem, 1996, Algorithmica.

[19]  Gary Benson, et al. Two-Dimensional Periodicity in Rectangular Arrays, 1998, SIAM J. Comput.