Efficiently computing runs on a trie

Abstract: A maximal repetition, or run, in a string is a maximal periodic substring whose smallest period is at most half the length of the substring. In this paper, we consider runs that correspond to a path on a trie, that is, a rooted edge-labeled tree in which each edge is labeled with a single symbol, where one endpoint of the path must be an ancestor of the other. For a trie with $n$ edges, we show that the number of runs is less than $n$. We also show an asymptotic lower bound on the maximum density of runs in tries: $\lim_{n \to \infty} \rho_T(n)/n > 0.9932348$, where $\rho_T(n)$ is the maximum number of runs in a trie with $n$ edges. Furthermore, we give an $O(n \log \log n)$ time and $O(n)$ space algorithm for finding all runs.
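
To make the definition of a run concrete, the following is a minimal brute-force sketch for the special case of a plain string (a trie consisting of a single path). It is not the algorithm of the paper: it runs in quadratic time, and the function name `naive_runs` and the half-open `(start, end, period)` output format are assumptions chosen for illustration only.

```python
def naive_runs(s):
    """Return all runs of s as sorted (start, end, period) triples.

    s[start:end] is a maximal substring with smallest period `period`
    and length at least 2 * period. Quadratic-time illustration only.
    """
    n = len(s)
    best = {}  # (start, end) -> smallest period seen for that interval
    # Periods are tried in increasing order, so the first period recorded
    # for an interval is its smallest one.
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if s[i] != s[i + p]:
                i += 1
                continue
            # Extend to the right while position j matches position j + p.
            j = i
            while j + p < n and s[j] == s[j + p]:
                j += 1
            start, end = i, j + p
            # Keep the interval only if the period fits at least twice.
            if end - start >= 2 * p and (start, end) not in best:
                best[(start, end)] = p
            # Position j is a mismatch (or the end), so resume after it.
            i = j + 1
    return sorted((a, b, p) for (a, b), p in best.items())


if __name__ == "__main__":
    for start, end, period in naive_runs("mississippi"):
        print(start, end, period, "mississippi"[start:end])
```

On the string "mississippi" (length 11) this reports four runs: "ississi" with period 3 and the three squares "ss", "ss", and "pp" with period 1, consistent with the known bound that a string has fewer runs than its length.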
