A New Succinct Representation of RMQ-Information and Improvements in the Enhanced Suffix Array

The range minimum query (RMQ) problem is to preprocess an array of length n in O(n) time such that all subsequent queries asking for the position of a minimum element between two specified indices can be answered in constant time. This problem was first solved by Berkman and Vishkin [1], and Sadakane [2] gave the first succinct data structure, which uses 4n + o(n) bits of additional space. In practice, this method has several drawbacks: it needs O(n log n) bits of intermediate space during construction, and it builds on previous results on succinct data structures. We overcome these problems by giving the first algorithm that never uses more than 2n + o(n) bits and does not rely on rank and select queries or other succinct data structures. We demonstrate the importance of this result by simplifying the Enhanced Suffix Array [3] and reducing its space consumption, while retaining its capability of simulating top-down traversals of the suffix tree, used, e.g., to locate all occ occurrences of a pattern p in a text in optimal O(|p| + occ) time (assuming constant alphabet size). We further prove a lower bound of 2n - o(n) bits, which makes our algorithm asymptotically optimal.
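To make the query interface concrete, the following is a minimal sketch of a classical sparse-table RMQ in Python. It is not the succinct structure described in the abstract: it is a well-known baseline that takes O(n log n) time and O(n log n) words of space, far more than 2n + o(n) bits, and is shown only to illustrate what "position of a minimum between two indices in constant time" means. The function names `build_sparse_table` and `rmq` are hypothetical and chosen for this illustration.

```python
from typing import List, Sequence


def build_sparse_table(a: Sequence[int]) -> List[List[int]]:
    """Precompute, for every i and j, the position of the leftmost
    minimum in a[i .. i + 2^j - 1]. Baseline only, not the paper's method."""
    n = len(a)
    table = [list(range(n))]  # level 0: each block of length 1 is its own minimum
    j = 1
    while (1 << j) <= n:
        prev = table[j - 1]
        half = 1 << (j - 1)
        row = []
        for i in range(n - (1 << j) + 1):
            left, right = prev[i], prev[i + half]
            # "<=" keeps the leftmost position when values tie
            row.append(left if a[left] <= a[right] else right)
        table.append(row)
        j += 1
    return table


def rmq(a: Sequence[int], table: List[List[int]], lo: int, hi: int) -> int:
    """Return the position of the leftmost minimum in a[lo..hi] (inclusive).
    Two overlapping power-of-two blocks cover the range, so the query
    needs only two table lookups and one comparison."""
    k = (hi - lo + 1).bit_length() - 1  # largest k with 2^k <= range length
    left = table[k][lo]
    right = table[k][hi - (1 << k) + 1]
    return left if a[left] <= a[right] else right


a = [3, 1, 4, 1, 5, 9, 2, 6]
table = build_sparse_table(a)
print(rmq(a, table, 0, 7))  # leftmost minimum of the whole array
print(rmq(a, table, 4, 7))  # minimum of the suffix starting at index 4
```

The query returns a position rather than a value, matching the problem statement. The space gap between this table and a structure of 2n + o(n) bits is what the abstract's result addresses.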

[1] Robert E. Tarjan, et al. An Efficient Parallel Biconnectivity Algorithm, 2011, SIAM J. Comput.

[2] Enno Ohlebusch, et al. Replacing suffix trees with enhanced suffix arrays, 2004, J. Discrete Algorithms.

[3] Stephen Alstrup, et al. Nearest common ancestors: a survey and a new distributed algorithm, 2002, SPAA.

[4] Uzi Vishkin, et al. Recursive Star-Tree Parallel Data Structure, 1993, SIAM J. Comput.

[5] Gonzalo Navarro, et al. Compressed full-text indexes, 2007, CSUR.

[6] Steven Skiena, et al. Lowest common ancestors in trees and directed acyclic graphs, 2005, J. Algorithms.

[7] Robert E. Tarjan, et al. Scaling and related techniques for geometry problems, 1984, STOC '84.

[8] Guy Jacobson, et al. Space-efficient static trees and graphs, 1989, 30th Annual Symposium on Foundations of Computer Science.

[9] Hiroki Arimura, et al. Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications, 2001, CPM.

[10] Kunihiko Sadakane, et al. Compressed Suffix Trees with Full Functionality, 2007, Theory of Computing Systems.

[11] Eugene W. Myers, et al. Suffix arrays: a new method for on-line string searches, 1993, SODA '90.

[12] S. Muthukrishnan, et al. Efficient algorithms for document retrieval problems, 2002, SODA '02.

[13] Kuan-Yu Chen, et al. On the range maximum-sum segment query problem, 2007, Discret. Appl. Math.

[14] Andrew Chi-Chih Yao, et al. Should Tables Be Sorted?, 1981, JACM.

[15] Volker Heun, et al. Theoretical and Practical Improvements on the RMQ-Problem, with Applications to LCA and LCE, 2006, CPM.

[16] Kunihiko Sadakane, et al. Space-Efficient Data Structures for Flexible Text Retrieval Systems, 2002, ISAAC.

[17] Kunihiko Sadakane, et al. Succinct representations of lcp information and improvements in the compressed suffix arrays, 2002, SODA '02.