A Faster Algorithm for Computing Maximal $\alpha$-gapped Repeats in a String

A string $x = uvu$ with both $u$ and $v$ non-empty is called a gapped repeat with period $p = |uv|$, and is denoted by the pair $(x, p)$. If $p \le \alpha (|x| - p)$ with $\alpha > 1$, then $(x, p)$ is called an $\alpha$-gapped repeat. An occurrence $[i, i+|x|-1]$ of an $\alpha$-gapped repeat $(x, p)$ in a string $w$ is called a maximal $\alpha$-gapped repeat of $w$ if it cannot be extended either to the left or to the right in $w$ with the same period $p$. Kolpakov et al. (CPM 2014) showed that, given a string of length $n$ over a constant alphabet, all the occurrences of maximal $\alpha$-gapped repeats in the string can be computed in $O(\alpha^2 n + occ)$ time, where $occ$ is the number of occurrences. In this paper, we propose a faster $O(\alpha n + occ)$-time algorithm to solve this problem, improving the result of Kolpakov et al. by a factor of $\alpha$.
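To make the definitions concrete, the following is a minimal brute-force sketch that enumerates maximal $\alpha$-gapped repeats directly from the definition. It is not the algorithm proposed in the paper, nor that of Kolpakov et al.: it runs in quadratic time and is only meant as a reference for small inputs. The function name `maximal_alpha_gapped_repeats`, the 0-based positions, and the `(i, |x|, p)` output format are assumptions made for this illustration.

```python
def maximal_alpha_gapped_repeats(w, alpha):
    """Naive O(n^2) enumeration (illustrative only, not the paper's method).

    Returns a sorted list of triples (i, x_len, p), where the occurrence
    w[i : i + x_len] = u v u has period p = |uv| and arm length |u| = x_len - p.
    Positions are 0-based (an assumption of this sketch).
    """
    n = len(w)
    found = []
    for p in range(1, n):
        j = 0
        while j < n - p:
            if w[j] != w[j + p]:
                j += 1
                continue
            # Maximal run of positions j with w[j] == w[j + p]:
            # the run cannot be extended left or right with the same period p.
            start = j
            while j < n - p and w[j] == w[j + p]:
                j += 1
            arm = j - start  # |u| = |x| - p
            # arm < p guarantees a non-empty gap v;
            # p <= alpha * arm is the alpha-gapped condition.
            if arm < p and p <= alpha * arm:
                found.append((start, p + arm, p))
    return sorted(found)


if __name__ == "__main__":
    # "abcab" = u v u with u = "ab", v = "c", period p = 3.
    print(maximal_alpha_gapped_repeats("xabcaby", 2))
```

For each candidate period $p$, the sketch scans for maximal runs of positions $j$ with $w[j] = w[j+p]$. A run of length $|u| < p$ corresponds to a gapped repeat that cannot be extended in either direction, and it is reported when $p \le \alpha |u|$.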

[1] Jens Stoye, et al. Linear time algorithms for finding and representing all the tandem repeats in a string, 2004, J. Comput. Syst. Sci.

[2] Wojciech Rytter, et al. Text Algorithms, 1994.

[3] Michael A. Bender, et al. The LCA Problem Revisited, 2000, LATIN.

[4] Abraham Lempel, et al. A universal algorithm for sequential data compression, 1977, IEEE Trans. Inf. Theory.

[5] Jens Stoye, et al. Finding Maximal Pairs with Bounded Gap, 1999, CPM.

[6] Maxime Crochemore, et al. Computing the Maximal-Exponent Repeats of an Overlap-Free String in Linear Time, 2012, SPIRE.

[7] Gregory Kucherov, et al. Finding repeats with fixed gap, 2000, Proceedings Seventh International Symposium on String Processing and Information Retrieval. SPIRE 2000.

[8] Dan Gusfield, et al. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology, 1997.

[9] Eugene W. Myers, et al. Suffix arrays: a new method for on-line string searches, 1993, SODA '90.

[11] Robert E. Tarjan, et al. A Linear-Time Algorithm for a Special Case of Disjoint Set Union, 1985, J. Comput. Syst. Sci.

[12] Peter Weiner, et al. Linear Pattern Matching Algorithms, 1973, SWAT.

[13] Esko Ukkonen, et al. On-line construction of suffix trees, 1995, Algorithmica.

[14] David Haussler, et al. The Smallest Automaton Recognizing the Subwords of a Text, 1985, Theor. Comput. Sci.

[15] Mikhail Posypkin, et al. Searching of Gapped Repeats and Subrepetitions in a Word, 2014, CPM.

[16] Maxime Crochemore, et al. Transducers and Repetitions, 1986, Theor. Comput. Sci.