Optimal Lower and Upper Bounds for Representing Sequences

Sequence representations supporting the queries access, select, and rank are at the core of many data structures. There is a considerable gap between the various upper bounds and the few lower bounds known for such representations, and in how these bounds relate to the space used. In this article, we prove a strong lower bound for rank, which holds under rather permissive assumptions on the space used, and give matching upper bounds that require only a compressed representation of the sequence. Within this compressed space, the operations access and select can be solved in constant or almost-constant time, which is optimal for large alphabets. Our new upper bounds dominate all previous work in the time/space map.
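For readers unfamiliar with the three queries, the following is a minimal reference sketch of what they compute, written as naive linear-time scans over an uncompressed Python sequence. It is not the compressed structure of the article. The function names, the 0-based positions, the inclusive prefix in rank, and the 1-based occurrence index in select are conventions assumed here for illustration only.

```python
def access(seq, i):
    """Return the symbol stored at 0-based position i."""
    return seq[i]


def rank(seq, c, i):
    """Count the occurrences of symbol c in seq[0..i], inclusive (assumed convention)."""
    return sum(1 for x in seq[: i + 1] if x == c)


def select(seq, c, j):
    """Return the 0-based position of the j-th occurrence of c (j >= 1), or -1 if absent."""
    count = 0
    for pos, x in enumerate(seq):
        if x == c:
            count += 1
            if count == j:
                return pos
    return -1
```

Under these conventions, rank and select are inverse to each other on the occurrences of a symbol: for any valid j, rank(seq, c, select(seq, c, j)) equals j.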
