Efficient Counting of Square Substrings in a Tree

We give an algorithm that counts all distinct squares in a labeled tree in O(n log^2 n) time. There are two main obstacles to overcome. First, Crochemore et al. showed in 2012 that the maximum number of such squares is Θ(n^(4/3)). This is substantially different from the case of classical strings, which admit only a linear number of distinct squares. We deal with this difficulty by introducing a compact representation of all squares (based on maximal cyclic shifts) that requires only O(n log n) space. The second obstacle is the lack of adequate algorithmic tools for labeled trees. Consequently, we develop several novel techniques, which form the most complex part of the paper. In particular, we extend Imre Simon's implementation of the failure function in pattern matching machines.
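To make the counted object concrete, the following is a minimal brute-force sketch, not the algorithm of the paper. It assumes that labels sit on edges, that a substring of the tree is the word read along a simple path in either direction, and that a square is a nonempty word of the form xx. The function name `count_distinct_squares` and the `(u, v, label)` edge format are hypothetical, chosen only for illustration. The sketch enumerates every directed simple path, so its running time is at least quadratic and it is useful only as a reference on small inputs.

```python
from collections import defaultdict


def count_distinct_squares(n, edges):
    """Count distinct squares read along simple paths of an edge-labeled tree.

    n     -- number of nodes, numbered 0..n-1
    edges -- list of (u, v, label) triples with single-character labels
             (hypothetical input format, assumed for this sketch)
    """
    adj = defaultdict(list)
    for u, v, label in edges:
        adj[u].append((v, label))
        adj[v].append((u, label))

    squares = set()
    for start in range(n):
        # Iterative DFS from `start`. In a tree every other node is reached
        # by exactly one simple path, so each directed path is seen once.
        stack = [(start, -1, "")]
        while stack:
            node, parent, word = stack.pop()
            half = len(word) // 2
            # A square is a nonempty even-length word whose halves coincide.
            if word and len(word) % 2 == 0 and word[:half] == word[half:]:
                squares.add(word)
            for nxt, label in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, word + label))
    return len(squares)


if __name__ == "__main__":
    # Path with edge labels a, b, a, b: the squares are "abab" and "baba".
    path = [(0, 1, "a"), (1, 2, "b"), (2, 3, "a"), (3, 4, "b")]
    print(count_distinct_squares(5, path))
```

On a plain path this reduces to counting distinct squares in a string and its reverse. The difference the abstract points to appears only on branching trees, where the number of distinct squares can grow faster than linearly.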

[1]  Sandi Klavzar et al. Nonrepetitive colorings of trees, 2007, Discret. Math.

[2]  Michael G. Main et al. An O(n log n) Algorithm for Finding All Repetitions in a String, 1984, J. Algorithms.

[3]  Wojciech Rytter et al. Repetitions in strings: Algorithms and combinatorics, 2009, Theor. Comput. Sci.

[4]  S. Rao Kosaraju et al. Efficient tree pattern matching, 1989, 30th Annual Symposium on Foundations of Computer Science.

[5]  Robert E. Tarjan et al. Fast Algorithms for Finding Nearest Common Ancestors, 1984, SIAM J. Comput.

[6]  Wojciech Rytter et al. The Maximum Number of Squares in a Tree, 2012, CPM.

[7]  Aviezri S. Fraenkel et al. How Many Squares Can a String Contain?, 1998, J. Comb. Theory, Ser. A.

[8]  Wojciech Rytter et al. Extracting Powers and Periods in a String from Its Runs Structure, 2010, SPIRE.

[9]  Michael A. Bender et al. The Level Ancestor Problem simplified, 2004, Theor. Comput. Sci.

[10]  Tetsuo Shibuya. Constructing the Suffix Tree of a Tree with a Large Alphabet, 1999, ISAAC.

[11]  Imre Simon. String Matching Algorithms and Automata, 1994, Results and Trends in Theoretical Computer Science.

[12]  Michael A. Bender et al. The Level Ancestor Problem Simplified, 2002, LATIN.

[13]  Wojciech Rytter et al. Jewels of stringology, 2002.