Efficient Counting of Square Substrings in a Tree

We give an algorithm that counts all distinct squares in a labeled tree in O(n log^2 n) time. There are two main obstacles to overcome. First, Crochemore et al. showed in 2012 that the maximum number of such squares is Θ(n^{4/3}). This is substantially different from the case of classical strings, which admit only a linear number of distinct squares. We deal with this difficulty by introducing a compact representation of all squares (based on maximal cyclic shifts) that requires only O(n log n) space. The second obstacle is the lack of adequate algorithmic tools for labeled trees. Consequently, we develop several novel techniques, which form the most complex part of the paper. In particular, we extend Imre Simon’s implementation of the failure function in pattern matching machines.
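
To make the counted object concrete, the following is a minimal brute-force sketch in Python. It is not the O(n log^2 n) algorithm of the paper. It assumes that the tree is unrooted, that each edge carries a single-letter label, and that a substring is the word spelled along a simple path read in one direction. The function name `distinct_squares_in_tree` and the `(u, v, label)` input format are hypothetical choices made for this illustration.

```python
from collections import defaultdict


def distinct_squares_in_tree(edges):
    """Return the set of distinct square words spelled by simple paths.

    Assumption for this sketch: `edges` is an iterable of (u, v, label)
    triples describing an unrooted tree with single-character edge labels.
    This is a naive enumeration over all directed simple paths, intended
    only to illustrate the definition, not to be efficient.
    """
    adj = defaultdict(list)
    for u, v, label in edges:
        adj[u].append((v, label))
        adj[v].append((u, label))

    squares = set()
    for start in list(adj):
        # Iterative DFS from `start`; in a tree, avoiding the parent is
        # enough to guarantee that every explored path is simple.
        stack = [(start, None, "")]
        while stack:
            node, parent, word = stack.pop()
            half = len(word) // 2
            # A square is a nonempty word of the form xx.
            if word and len(word) % 2 == 0 and word[:half] == word[half:]:
                squares.add(word)
            for nxt, label in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, word + label))
    return squares


# A path whose four edges are all labelled "a" contains the squares "aa" and "aaaa".
path = [(0, 1, "a"), (1, 2, "a"), (2, 3, "a"), (3, 4, "a")]
print(sorted(distinct_squares_in_tree(path)))  # ['aa', 'aaaa']
```

The sketch stores every square explicitly, so its memory use grows with the number of distinct squares. That is the cost the compact representation based on maximal cyclic shifts is designed to avoid.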

[1]  Michael G. Main,et al.  An O(n log n) Algorithm for Finding All Repetitions in a String , 1984, J. Algorithms.

[2]  Frank Harary,et al.  Graph Theory , 2016 .

[3]  Sandi Klavzar,et al.  Nonrepetitive colorings of trees , 2007, Discret. Math..

[4]  Michael A. Bender,et al.  The Level Ancestor Problem simplified , 2004, Theor. Comput. Sci..

[5]  Tetsuo Shibuya.  Constructing the Suffix Tree of a Tree with a Large Alphabet , 1999, ISAAC.

[6]  Xuding Zhu,et al.  Nonrepetitive list colourings of paths , 2011, Random Struct. Algorithms.

[7]  S. Rao Kosaraju,et al.  Efficient tree pattern matching , 1989, 30th Annual Symposium on Foundations of Computer Science.

[8]  Robert E. Tarjan,et al.  Fast Algorithms for Finding Nearest Common Ancestors , 1984, SIAM J. Comput..

[9]  Michael A. Bender,et al.  The Level Ancestor Problem Simplified , 2002, LATIN.

[10]  Jens Stoye,et al.  Linear time algorithms for finding and representing all the tandem repeats in a string , 2004, J. Comput. Syst. Sci..

[11]  Wojciech Rytter,et al.  Extracting Powers and Periods in a String from Its Runs Structure , 2010, SPIRE.

[12]  Wojciech Rytter,et al.  Efficient counting of square substrings in a tree , 2014, Theor. Comput. Sci..

[13]  Lucian Ilie,et al.  A note on the number of squares in a word , 2007, Theor. Comput. Sci..

[14]  Noga Alon,et al.  Nonrepetitive colorings of graphs , 2002, Random Struct. Algorithms.

[15]  Imre Simon.  String Matching Algorithms and Automata , 1994, Results and Trends in Theoretical Computer Science.

[16]  Wojciech Rytter,et al.  Jewels of stringology , 2002 .

[17]  Wojciech Rytter,et al.  The Maximum Number of Squares in a Tree , 2012, CPM.

[18]  Wojciech Rytter,et al.  Efficient Counting of Square Substrings in a Tree , 2012 .

[19]  Aviezri S. Fraenkel,et al.  How Many Squares Can a String Contain? , 1998, J. Comb. Theory, Ser. A.

[20]  Wojciech Rytter,et al.  Jewels of stringology : text algorithms , 2002 .

[21]  Lucian Ilie,et al.  A simple proof that a word of length n has at most 2n distinct squares , 2005, J. Comb. Theory, Ser. A.

[22]  Francine Blanchet-Sadri,et al.  Counting Distinct Squares in Partial Words , 2009, Acta Cybern..

[23]  S. Rao Kosaraju,et al.  Efficient Tree Pattern Matching (Preliminary Version) , 1989, FOCS 1989.

[24]  Wojciech Rytter,et al.  Repetitions in strings: Algorithms and combinatorics , 2009, Theor. Comput. Sci..