PTrie: Data Structure for Compressing and Storing Sets via Prefix Sharing

Sets and their efficient implementation are fundamental throughout computer science, including model checking, where sets are the basic data structure for storing (encodings of) states during a state-space exploration. In the quest for fast and memory-efficient methods for manipulating large sets, we present a novel data structure called PTrie for storing sets of binary strings of arbitrary length. The PTrie data structure distinguishes itself by compressing the stored elements while retaining the key desirable characteristics of conventional hash-based implementations, namely fast insertion and lookup operations. We provide the theoretical foundation of PTries, prove the correctness of their operations, and conduct empirical studies analysing the performance of PTries on randomly generated binary strings as well as on state-space exploration of a large collection of Petri net models from the 2016 edition of the Model Checking Contest (MCC’16). We experimentally document that a significant space reduction can be achieved with a modest overhead in running time. Lastly, we provide an efficient implementation of the PTrie data structure under the GPL version 3 license, so that the technology is available for memory-intensive applications such as model-checking tools.
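To make the idea of prefix sharing concrete, the following is a minimal sketch of a plain binary trie storing a set of bit strings. It is not the PTrie itself and does not reproduce its compression scheme; the class name `BinaryTrie`, its methods, and the encoding of elements as Python strings of '0' and '1' are assumptions chosen purely for illustration.

```python
class BinaryTrie:
    """Plain binary trie over bit strings such as "0110" (illustrative only)."""

    def __init__(self):
        # Each node is a list: [child_for_0, child_for_1, is_member].
        self.root = [None, None, False]
        self.node_count = 1  # counts the root
        self.size = 0        # number of stored strings

    def insert(self, bits):
        """Insert a bit string; return True if it was not already present."""
        node = self.root
        for ch in bits:
            i = "01".index(ch)  # raises ValueError on anything but '0'/'1'
            if node[i] is None:
                # Only the part of the string not shared with earlier
                # insertions allocates new nodes.
                node[i] = [None, None, False]
                self.node_count += 1
            node = node[i]
        if node[2]:
            return False
        node[2] = True
        self.size += 1
        return True

    def contains(self, bits):
        """Return True if the exact bit string has been inserted."""
        node = self.root
        for ch in bits:
            node = node["01".index(ch)]
            if node is None:
                return False
        return node[2]


trie = BinaryTrie()
trie.insert("0101")
trie.insert("0110")  # shares the prefix "01" with the first string
```

In this sketch the two strings share the nodes for the common prefix "01", so the second insertion allocates only two new nodes instead of four. Both operations take time proportional to the length of the string, independent of how many strings are stored.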
