Efficient algorithms for shortest partial seeds in words

Abstract

A factor u of a word w is a cover of w if every position in w lies within some occurrence of u in w. A factor u is a seed of w if it is a cover of a superstring of w. Covers and seeds extend the classical notions of periodicity. We introduce a new notion of α-partial seed, that is, a factor covering as a seed at least α positions in a given word. We use the Cover Suffix Tree, recently introduced in the context of α-partial covers (Kociumaka et al., 2015, [13]); an O(n log n)-time algorithm constructing such a tree is known. However, it appears that partial seeds are more complicated than partial covers: our algorithms require algebraic manipulations of special functions related to edges of the modified Cover Suffix Tree and the border array. We present a procedure for computing shortest α-partial seeds that works in O(n) time if the Cover Suffix Tree is already given. This is a full version, which includes all the proofs, of a paper that appeared at CPM 2014 [1].
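The definitions above can be illustrated with a small brute-force sketch. It is not the Cover Suffix Tree procedure from the paper and makes no attempt at the stated running times. The function names are hypothetical, and the reading of "covering as a seed" used here (full occurrences of u, plus occurrences that overhang the left or right end of w) is an assumption based on the definition of a seed as a cover of a superstring.

```python
def covered_positions(w, u):
    """Positions of w that lie inside some full occurrence of u (naive scan)."""
    covered = set()
    start = w.find(u)
    while start != -1:
        covered.update(range(start, start + len(u)))
        start = w.find(u, start + 1)  # step by 1 so overlapping occurrences count
    return covered


def is_cover(w, u):
    """True if every position of w lies within some occurrence of u."""
    return len(u) > 0 and len(covered_positions(w, u)) == len(w)


def seed_covered_positions(w, u):
    """Positions covered by full occurrences or by overhanging occurrences.

    Assumption: an overhang on the left means a proper suffix of u is a
    prefix of w; on the right, a proper prefix of u is a suffix of w.
    """
    covered = covered_positions(w, u)
    n, m = len(w), len(u)
    for k in range(1, min(m, n + 1)):  # overlap length k, with k < m and k <= n
        if w.startswith(u[m - k:]):
            covered.update(range(k))          # left overhang covers w[0:k]
        if w.endswith(u[:k]):
            covered.update(range(n - k, n))   # right overhang covers w[n-k:n]
    return covered


def shortest_partial_seed(w, alpha):
    """Shortest factor of w covering, as a seed, at least alpha positions.

    Brute force over all factors by increasing length; returns None if
    no factor qualifies (for example when alpha > len(w)).
    """
    n = len(w)
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            u = w[i:i + length]
            if len(seed_covered_positions(w, u)) >= alpha:
                return u
    return None
```

For example, in w = "abaabab" the factor "aba" has full occurrences covering only six of the seven positions, so it is not a cover, but the right-overhanging occurrence (the prefix "ab" of u is a suffix of w) covers the last position, so "aba" is a seed of w under this reading.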

[1] Michael A. Bender, et al. The LCA Problem Revisited, 2000, LATIN.

[2] Kunsoo Park, et al. Finding Approximate Covers of Strings, 2002.

[3] Andrzej Ehrenfeucht, et al. Efficient Detection of Quasiperiodicities in Strings, 1993, Theor. Comput. Sci.

[4] Wojciech Rytter, et al. Efficient Algorithms for Shortest Partial Seeds in Words, 2014, CPM.

[5] Wojciech Rytter, et al. Repetitions in strings: Algorithms and combinatorics, 2009, Theor. Comput. Sci.

[6] Wojciech Rytter, et al. Jewels of stringology, 2002.

[7] Jeong Seop Sim, et al. Approximate Seeds of Strings, 2003, Stringology.

[8] Maxime Crochemore, et al. Algorithms on strings, 2007.

[9] Wojciech Rytter, et al. A Linear-Time Algorithm for Seeds Computation, 2011, SODA.

[10] William F. Smyth, et al. An Optimal Algorithm to Compute all the Covers of a String, 1994, Inf. Process. Lett.

[11] Costas S. Iliopoulos, et al. Covering a string, 2005, Algorithmica.

[12] Robert E. Tarjan, et al. A Linear-Time Algorithm for a Special Case of Disjoint Set Union, 1985, J. Comput. Syst. Sci.

[13] Robert E. Tarjan, et al. Fast Algorithms for Finding Nearest Common Ancestors, 1984, SIAM J. Comput.

[14] Dany Breslauer, et al. An On-Line String Superprimitivity Test, 1992, Inf. Process. Lett.

[15] Yin Li, et al. Computing the Cover Array in Linear Time, 2001, Algorithmica.

[16] William F. Smyth, et al. A Correction to "An Optimal Algorithm to Compute all the Covers of a String", 1995, Inf. Process. Lett.

[17] Costas S. Iliopoulos, et al. Enhanced string covering, 2013, Theor. Comput. Sci.

[18] Costas S. Iliopoulos, et al. Optimal Superprimitivity Testing for Strings, 1991, Inf. Process. Lett.

[19] Wojciech Rytter, et al. Fast Algorithm for Partial Covers in Words, 2014, Algorithmica.