Optimal Trade-Offs for Succinct String Indexes

Let s be a string whose symbols are available only through access(i), a read-only operation that probes s and returns the symbol at position i. Many compressed data structures for strings, trees, and graphs require two kinds of queries on s: select(c, j), which returns the position of the jth occurrence of c in s, and rank(c, p), which counts the occurrences of c in the first p positions of s. We give matching upper and lower bounds for this problem. The main contribution is a general technique for proving lower bounds on succinct data structures that is based on the access patterns of the supported operations, abstracting from the particular operations at hand.
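To make the three operations concrete, the following is a minimal reference sketch of their semantics, not the index studied in the paper. It answers rank and select by linear scans that touch s only through access, so it uses no extra space but makes up to n probes per query; a succinct index aims to answer the same queries with far fewer probes at the cost of a small number of additional bits. The helper name make_access, the 0-based positions, the 1-based occurrence count j, and the explicit length argument n are assumptions made for illustration.

```python
from typing import Callable, Optional


def make_access(s: str) -> Callable[[int], str]:
    """Wrap a string so that it can only be read one symbol at a time.

    Hypothetical helper: models the read-only access(i) operation.
    Positions are assumed to be 0-based.
    """
    def access(i: int) -> str:
        return s[i]
    return access


def rank(access: Callable[[int], str], c: str, p: int) -> int:
    """Count the occurrences of c among the first p positions (0 .. p-1)."""
    return sum(1 for i in range(p) if access(i) == c)


def select(access: Callable[[int], str], n: int, c: str, j: int) -> Optional[int]:
    """Return the position of the j-th occurrence of c (j is 1-based).

    n is the length of the string; returns None when c occurs
    fewer than j times.
    """
    seen = 0
    for i in range(n):
        if access(i) == c:
            seen += 1
            if seen == j:
                return i
    return None


if __name__ == "__main__":
    text = "abracadabra"
    access = make_access(text)
    print(rank(access, "a", 4))               # occurrences of 'a' in "abra"
    print(select(access, len(text), "a", 3))  # position of the third 'a'
```

Under these conventions the two queries are inverse to each other in the sense that rank(c, select(c, j) + 1) equals j whenever the jth occurrence of c exists.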
