Succinct indexable dictionaries with applications to encoding k-ary trees and multisets

We consider the <i>indexable dictionary</i> problem, which consists of storing a set <i>S</i> ⊆ {0,…, <i>m</i> - 1} for some integer <i>m</i>, while supporting two operations: <i>rank</i>(<i>x</i>), which returns the number of elements in <i>S</i> that are less than <i>x</i> if <i>x</i> ∈ <i>S</i>, and -1 otherwise; and <i>select</i>(<i>i</i>), which returns the <i>i</i>-th smallest element in <i>S</i>. We give a structure that supports both operations in <i>O</i>(1) time on the RAM model and requires <i>B</i>(<i>n,m</i>) + <i>o</i>(<i>n</i>) + <i>O</i>(lg lg <i>m</i>) bits to store a set of size <i>n</i>, where <i>B</i>(<i>n,m</i>) = ⌈lg (<i>m</i> choose <i>n</i>)⌉ is the minimum number of bits required to store any <i>n</i>-element subset of a universe of size <i>m</i>. Previous dictionaries taking this space supported only (yes/no) membership queries in <i>O</i>(1) time. In the cell probe model we can remove the <i>O</i>(lg lg <i>m</i>) additive term in the space bound, answering a question raised by Fich and Miltersen, and Pagh.

We also present two applications of our dictionary structure:
• An information-theoretically optimal representation for <i>k</i>-ary <i>cardinal trees</i> (also known as <i>k</i>-ary tries). Our structure uses <i>C</i>(<i>n,k</i>) + <i>o</i>(<i>n</i> + lg <i>k</i>) bits to store a <i>k</i>-ary tree with <i>n</i> nodes and supports parent, <i>i</i>-th child, child labeled <i>i</i>, and the degree of a node in constant time, where <i>C</i>(<i>n,k</i>) is the minimum number of bits required to store any <i>n</i>-node <i>k</i>-ary tree. Previous space-efficient representations for cardinal <i>k</i>-ary trees required <i>C</i>(<i>n,k</i>) + Ω(<i>n</i>) bits.
• An optimal representation for multisets where (appropriate generalisations of) the <i>select</i> and <i>rank</i> operations can be supported in <i>O</i>(1) time. Our structure uses <i>B</i>(<i>n, m + n</i>) + <i>o</i>(<i>n</i>) + <i>O</i>(lg lg <i>m</i>) bits to represent a multiset of size <i>n</i> from an <i>m</i>-element set; the first term is the minimum number of bits required to represent such a multiset.
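
To make the operation definitions concrete, here is a minimal Python sketch of the <i>rank</i>/<i>select</i> semantics and of the bound <i>B</i>(<i>n,m</i>). It is a plain sorted list, not the succinct structure described above, and it matches neither the space bound nor the <i>O</i>(1) time bound. The names `NaiveIndexableDictionary`, `min_bits` and `multiset_to_set` are hypothetical, and counting <i>select</i> from <i>i</i> = 0 is an assumption made so that <i>select</i>(<i>rank</i>(<i>x</i>)) = <i>x</i>. The last helper shows one standard way to map a multiset to a set over a universe of size less than <i>m</i> + <i>n</i>; it is not necessarily the construction used by the authors.

```python
from bisect import bisect_left
from math import comb


class NaiveIndexableDictionary:
    """Reference semantics only: a sorted list, not a succinct structure."""

    def __init__(self, elements, m):
        self.m = m
        self.items = sorted(set(elements))
        if self.items and not (0 <= self.items[0] and self.items[-1] < m):
            raise ValueError("elements must lie in {0, ..., m - 1}")

    def rank(self, x):
        # Number of elements of S smaller than x if x is in S, else -1.
        pos = bisect_left(self.items, x)
        if pos < len(self.items) and self.items[pos] == x:
            return pos
        return -1

    def select(self, i):
        # i-th smallest element, counting from i = 0 (an assumption).
        if not 0 <= i < len(self.items):
            raise IndexError("select index out of range")
        return self.items[i]


def min_bits(n, m):
    # B(n, m) = ceil(lg(m choose n)), computed exactly with integers:
    # for an integer k >= 1, ceil(lg k) equals (k - 1).bit_length().
    if not 0 <= n <= m:
        raise ValueError("need 0 <= n <= m")
    return (comb(m, n) - 1).bit_length()


def multiset_to_set(multiset):
    # Shift the i-th smallest value v to v + i. The results are distinct
    # and all smaller than m + n, so a multiset becomes an ordinary set.
    return [v + i for i, v in enumerate(sorted(multiset))]


d = NaiveIndexableDictionary([3, 7, 8, 15], m=16)
print(d.rank(8), d.rank(9), d.select(1))  # 2 -1 7
print(min_bits(4, 16))                    # 11, since (16 choose 4) = 1820
print(multiset_to_set([2, 2, 5]))         # [2, 3, 7]
```

For <i>n</i> = 4 and <i>m</i> = 16 the information-theoretic minimum is 11 bits, whereas the sorted list above spends a full machine word per element; the gap between the two is what a succinct representation closes.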
