A General Framework for Dynamic Succinct and Compressed Data Structures

∗ Universität des Saarlandes, Saarbrücken, Germany; Email: philologos14@gmail.com; Corresponding Author
† Bell Labs, Dublin, Ireland; Email: pat.nicholson@alcatel-lucent.com

Succinct data structures are becoming increasingly popular in big data processing applications due to their low memory consumption. However, a feature that is currently lacking from most implementations of succinct data structures is dynamism. In this paper we design, implement, and test a general framework that allows for practical dynamic succinct structures. Firstly, a key component of our approach is careful memory management, which is often overlooked in the succinct data structures literature. Most succinct data structures allocate and deallocate relatively small data blocks each time a modify, insert, or delete operation occurs. We demonstrate experimentally that the space cost of neglecting memory management can be over 25% for dynamic data structures of this type. Secondly, using our memory management approach, we describe implementations of compressed modifiable bit vectors and of extended compressed random access memory (recently proposed by Jansson, Sadakane, and Sung [ICALP 2012]). Finally, we implement and test our data structures using several popular compression libraries, on both synthetic data (for the compressed modifiable bit vector) and a real-world temporal graph (for the extended compressed random access memory). Our data structures provide an easy-to-use interface that allows standard algorithms (in our example, breadth-first search in a graph) to be run on top of the compressed data, decreasing memory consumption at the expense of running time.
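To make the block-wise idea concrete, the following is a minimal sketch, not the implementation evaluated in the paper. It assumes fixed-size blocks that are each compressed independently, uses Python's standard `zlib` module as a stand-in for the compression libraries mentioned above, and the class name `BlockCompressedArray` and its methods are hypothetical. It illustrates why every modification replaces a small compressed block, which is the allocation pattern that makes memory management matter.

```python
import zlib


class BlockCompressedArray:
    """Hypothetical sketch: a byte array stored as independently compressed blocks."""

    def __init__(self, data: bytes, block_size: int = 4096):
        self.block_size = block_size
        self.length = len(data)
        # Compress each fixed-size block on its own so that a single block
        # can be decompressed without touching the rest of the data.
        self.blocks = [
            zlib.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)
        ]

    def read(self, index: int) -> int:
        """Return the byte at `index`, decompressing only its block."""
        if not 0 <= index < self.length:
            raise IndexError(index)
        block = zlib.decompress(self.blocks[index // self.block_size])
        return block[index % self.block_size]

    def write(self, index: int, value: int) -> None:
        """Overwrite the byte at `index` and recompress its block."""
        if not 0 <= index < self.length:
            raise IndexError(index)
        b = index // self.block_size
        block = bytearray(zlib.decompress(self.blocks[b]))
        block[index % self.block_size] = value
        # The old compressed block is dropped and a new one, usually of a
        # different size, is allocated in its place.
        self.blocks[b] = zlib.compress(bytes(block))

    def compressed_size(self) -> int:
        """Total number of bytes held by the compressed blocks."""
        return sum(len(block) for block in self.blocks)


# Usage: standard code reads and writes through the interface
# while the data stays compressed.
arr = BlockCompressedArray(bytes(100_000), block_size=1024)
arr.write(5000, 7)
value = arr.read(5000)
```

In this sketch the block size trades running time for space: smaller blocks make each read or write cheaper to decompress but typically compress less well.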
