Paper information - Substring Suffix Selection

Substring Suffix Selection

We study the following substring suffix selection problem: given a substring of a string T of length n, compute its k-th lexicographically smallest suffix. This is a natural generalization of the well-known question of computing the maximal suffix of a string, which is a basic ingredient in many other problems. We first revisit two special cases of the problem, introduced by Babenko, Kolesnichenko and Starikovskaya [CPM'13], in which we are asked to compute the minimal non-empty and the maximal suffixes of a substring. For the maximal suffix problem, we give a linear-space data structure with O(1) query time and linear preprocessing time, i.e., we achieve optimal construction time and optimal query time simultaneously. For the minimal suffix problem, we give a linear-space data structure with O(\tau) query time and O(n log n / \tau) preprocessing time, where 1 <= \tau <= log n is a parameter of the data structure. As a sample application, we show that this data structure can be used to compute the Lyndon decomposition of any substring of T in O(k \tau) time, where k is the number of distinct factors in the decomposition. Finally, we move to the general case of the substring suffix selection problem, where exploiting combinatorial properties seems more difficult. Nevertheless, we develop a linear-space data structure with O(log^{2+\epsilon} n) query time.
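To make the problem statement concrete, the following is a minimal brute-force sketch of the two queries mentioned in the abstract. It is not the data structure from the paper. It only fixes what the expected answers are, and it sorts or scans the substring on every query. The function names `kth_suffix` and `lyndon_factors`, the half-open index convention `t[i:j]`, and 1-based `k` are assumptions made for illustration. The Lyndon decomposition here uses Duval's classical linear-time algorithm on the extracted substring.

```python
def kth_suffix(t: str, i: int, j: int, k: int) -> str:
    """Return the k-th (1-based) lexicographically smallest non-empty
    suffix of the substring t[i:j]. Brute force: sorts all suffixes."""
    sub = t[i:j]
    if not 1 <= k <= len(sub):
        raise ValueError("k must be between 1 and the substring length")
    suffixes = sorted(sub[p:] for p in range(len(sub)))
    return suffixes[k - 1]


def lyndon_factors(s: str) -> list[str]:
    """Lyndon decomposition of s via Duval's algorithm.
    Returns non-increasing Lyndon words whose concatenation is s."""
    factors = []
    n = len(s)
    i = 0
    while i < n:
        # Extend the longest prefix of s[i:] that is a power of a
        # Lyndon word, possibly followed by a proper prefix of it.
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        # Emit every full copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


t = "abracadabra"
smallest = kth_suffix(t, 2, 7, 1)   # minimal suffix of "racad"
largest = kth_suffix(t, 2, 7, 5)    # maximal suffix of "racad"
factors = lyndon_factors("banana")  # ["b", "an", "an", "a"]
```

In this sketch the minimal and maximal suffix queries are the special cases k = 1 and k = j - i. The last factor returned by `lyndon_factors` coincides with the minimal non-empty suffix of its input, which is the link between the minimal suffix problem and the Lyndon decomposition.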

Maxim A. Babenko | Pawel Gawrychowski | Tomasz Kociumaka | Tatiana Starikovskaya

[1] Maxime Crochemore, et al. Algorithms on strings, 2007.

[2] Michael A. Bender, et al. The LCA Problem Revisited, 2000, LATIN.

[3] William F. Smyth, et al. A taxonomy of suffix array construction algorithms, 2007, CSUR.

[4] Moshe Lewenstein, et al. Generalized Substring Compression, 2009, CPM.

[5] Jean Pierre Duval, et al. Factorizing Words over an Ordered Alphabet, 1983, J. Algorithms.

[6] Maxim A. Babenko, et al. On Minimal and Maximal Suffixes of a Substring, 2013, CPM.

[7] Philip Bille, et al. Substring Range Reporting, 2011, CPM.

[8] Graham Cormode, et al. Substring compression problems, 2005, SODA '05.

[9] M. Crochemore, et al. Algorithms on Strings: Tools, 2007.

[10] Gianni Franceschini, et al. Optimal suffix selection, 2007, STOC '07.

[11] Joseph JáJá, et al. Space-Efficient and Fast Algorithms for Multidimensional Dominance Reporting and Counting, 2004, ISAAC.

[12] R. Lyndon, et al. Free Differential Calculus, IV. The Quotient Groups of the Lower Central Series, 1958.

[13] Wojciech Rytter, et al. Text Algorithms, 1994.

[14] Alfred V. Aho, et al. The Design and Analysis of Computer Algorithms, 1974.

[15] Wojciech Rytter, et al. Efficient Data Structures for the Factor Periodicity Problem, 2012, SPIRE.