Random access to Fibonacci encoded files

A Wavelet tree is a data structure adjoined to a file that has been compressed by a variable-length encoding. It allows direct access to the underlying file, so that the compressed file itself no longer needs to be kept. In this paper, we adapt the Wavelet tree to Fibonacci codes so that, in addition to supporting direct access to the Fibonacci encoded file, it also increases the compression savings relative to the original Fibonacci compressed file. The improvements are achieved by means of a new pruning technique.
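For readers unfamiliar with Fibonacci codes, the following is a minimal sketch of the standard encoding, not of the Wavelet tree or pruning technique proposed in the paper. It assumes the usual definition: an integer n >= 1 is written as a sum of non-consecutive Fibonacci numbers 1, 2, 3, 5, 8, ..., the bits are emitted from the smallest weight upward, and a final 1 is appended so that every codeword ends in 11. The function names `fib_encode`, `fib_decode_stream`, and `access_by_scan` are illustrative only.

```python
def fib_encode(n):
    """Return the Fibonacci codeword of an integer n >= 1 as a bit string."""
    if n < 1:
        raise ValueError("Fibonacci codes are defined for n >= 1")
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    bits = []
    started = False
    # Greedy choice from the largest weight down yields a representation
    # with no two adjacent 1-bits.
    for f in reversed(fibs):
        if f <= n:
            bits.append("1")
            n -= f
            started = True
        elif started:
            bits.append("0")
    # Emit smallest weight first, then the terminating 1.
    return "".join(reversed(bits)) + "1"


def fib_decode_stream(bits):
    """Decode a concatenation of Fibonacci codewords into a list of integers."""
    values = []
    total = 0
    weight, next_weight = 1, 2
    prev = "0"
    for bit in bits:
        if bit == "1" and prev == "1":
            # Two consecutive 1-bits mark the end of a codeword.
            values.append(total)
            total = 0
            weight, next_weight = 1, 2
            prev = "0"
            continue
        if bit == "1":
            total += weight
        weight, next_weight = next_weight, weight + next_weight
        prev = bit
    return values


def access_by_scan(bits, i):
    """Return the i-th encoded value (0-based) by decoding from the start."""
    return fib_decode_stream(bits)[i]


encoded = "".join(fib_encode(v) for v in [1, 2, 4, 5])
print(encoded)                      # 11011101100011
print(fib_decode_stream(encoded))   # [1, 2, 4, 5]
```

Because codewords have variable length, `access_by_scan` must decode everything before position i. That sequential cost is what an auxiliary structure providing direct access is meant to avoid.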
