Efficient counting of square substrings in a tree

We give an algorithm that counts all distinct squares in a labeled tree in O(n log^2 n) time. There are two main obstacles to overcome. The first is that the number of distinct squares in a tree is Ω(n^(4/3)) (see Crochemore et al., 2012 [7]), which differs substantially from the case of classical strings, where there are only linearly many distinct squares. We overcome this obstacle by using a compact representation of all squares (based on maximal cyclic shifts) that requires only O(n log n) space. The second obstacle is the lack of adequate algorithmic tools for labeled trees. Consequently, we design several novel tools; this is the most complex part of the paper. In particular, we extend Imre Simon's compact representation of the failure table in pattern-matching machines to trees.
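To make the counted quantity concrete, the following is a minimal brute-force sketch, not the algorithm of the paper. It assumes that labels sit on the edges, that a substring of the tree is the word spelled along a simple path read in either direction, and that a square is a nonempty word of the form xx. The function name `count_distinct_squares` and the `(u, v, label)` edge format are illustrative choices only. The sketch enumerates every directed path, so it takes at least quadratic time and is useful only as a reference for small inputs.

```python
from collections import defaultdict


def count_distinct_squares(n, edges):
    """Count distinct squares spelled by simple paths of an edge-labeled tree.

    n     -- number of nodes, numbered 0..n-1 (assumed input format)
    edges -- list of (u, v, label) triples with single-character labels
    """
    # Undirected adjacency list: each edge can be traversed both ways.
    adj = defaultdict(list)
    for u, v, c in edges:
        adj[u].append((v, c))
        adj[v].append((u, c))

    def is_square(word):
        half = len(word) // 2
        return half > 0 and len(word) % 2 == 0 and word[:half] == word[half:]

    squares = set()
    for start in range(n):
        # Iterative DFS: 'word' is the label of the unique path start -> node.
        stack = [(start, -1, "")]
        while stack:
            node, parent, word = stack.pop()
            if is_square(word):
                squares.add(word)
            for nxt, c in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, word + c))
    return len(squares)


# A path whose edges spell "abab": the squares are "abab" and "baba".
path = [(0, 1, "a"), (1, 2, "b"), (2, 3, "a"), (3, 4, "b")]
print(count_distinct_squares(5, path))
```

Collecting the squares in a set is what makes the count "distinct": the same word reached along different paths is counted once.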

[1] Noga Alon, et al. Nonrepetitive colorings of graphs, 2005, Electron. Notes Discret. Math.

[2] Frank Harary, et al. Graph Theory, 2016.

[3] S. Rao Kosaraju, et al. Efficient Tree Pattern Matching (Preliminary Version), 1989, FOCS 1989.

[4] Michael G. Main, et al. An O(n log n) Algorithm for Finding All Repetitions in a String, 1984, J. Algorithms.

[5] Xuding Zhu, et al. Nonrepetitive list colourings of paths, 2011, Random Struct. Algorithms.

[6] S. Rao Kosaraju, et al. Efficient tree pattern matching, 1989, 30th Annual Symposium on Foundations of Computer Science.

[7] Wojciech Rytter, et al. Jewels of stringology, 2002.

[8] Wojciech Rytter, et al. Jewels of stringology: text algorithms, 2002.

[9] Robert E. Tarjan, et al. Fast Algorithms for Finding Nearest Common Ancestors, 1984, SIAM J. Comput.

[10] Wojciech Rytter, et al. The Maximum Number of Squares in a Tree, 2012, CPM.

[11] Wojciech Rytter, et al. Extracting Powers and Periods in a String from Its Runs Structure, 2010, SPIRE.

[12] Aviezri S. Fraenkel, et al. How Many Squares Can a String Contain?, 1998, J. Comb. Theory, Ser. A.

[13] Sandi Klavzar, et al. Nonrepetitive colorings of trees, 2007, Discret. Math.

[14] Michael A. Bender, et al. The Level Ancestor Problem simplified, 2004, Theor. Comput. Sci.

[15] Tetsuo Shibuya. Constructing the Suffix Tree of a Tree with a Large Alphabet, 1999, ISAAC.

[16] Jens Stoye, et al. Linear time algorithms for finding and representing all the tandem repeats in a string, 2004, J. Comput. Syst. Sci.

[17] Wojciech Rytter, et al. Repetitions in strings: Algorithms and combinatorics, 2009, Theor. Comput. Sci.

[18] Lucian Ilie, et al. A simple proof that a word of length n has at most 2n distinct squares, 2005, J. Comb. Theory, Ser. A.

[19] Francine Blanchet-Sadri, et al. Counting Distinct Squares in Partial Words, 2009, Acta Cybern.

[20] Lucian Ilie, et al. A note on the number of squares in a word, 2007, Theor. Comput. Sci.

[21] Imre Simon. String Matching Algorithms and Automata, 1994, Results and Trends in Theoretical Computer Science.

[22] Wojciech Rytter, et al. Efficient Counting of Square Substrings in a Tree, 2012, ISAAC.