On the Usefulness of Fibonacci Compression Codes

Recent publications advocate, for compression applications using large alphabets, various variable-length codes in which each codeword consists of an integral number of bytes. This paper shows that a different tradeoff with similar properties can be obtained with Fibonacci codes. These are fixed codeword sets based on binary representations of integers that use Fibonacci numbers of order m ≥ 2. Fibonacci codes have been used before; this paper extends previous work and presents several novel features. In particular, the compression efficiency is analyzed and compared to that of dense codes, and various table-driven decoding routines are suggested.
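As an illustration of a binary representation of integers based on Fibonacci numbers, the following is a minimal sketch of the textbook Fibonacci code of order m = 2. It is not taken from the paper: the function names `fib_encode` and `fib_decode` are hypothetical, the higher-order variants (m > 2) and the table-driven decoding routines mentioned in the abstract are not covered, and the decoder works bit by bit for clarity. Each integer n ≥ 1 is written as a sum of non-consecutive Fibonacci numbers (1, 2, 3, 5, 8, ...), least significant term first, and a final 1-bit is appended, so every codeword ends in the only occurrence of `11`.

```python
def fib_encode(n: int) -> str:
    """Return the order-2 Fibonacci codeword of n (n >= 1) as a bit string."""
    if n < 1:
        raise ValueError("Fibonacci codes are defined for integers n >= 1")
    # Collect the Fibonacci numbers 1, 2, 3, 5, ... that do not exceed n.
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs.pop()  # the last element is the first one larger than n
    # Greedy choice from the largest term down yields a representation
    # with no two adjacent 1-bits.
    bits = ["0"] * len(fibs)
    remaining = n
    for i in range(len(fibs) - 1, -1, -1):
        if fibs[i] <= remaining:
            bits[i] = "1"
            remaining -= fibs[i]
    # The appended 1 forms the terminating pattern "11".
    return "".join(bits) + "1"


def fib_decode(stream: str) -> list:
    """Decode a concatenation of order-2 Fibonacci codewords, bit by bit."""
    values = []
    fibs = [1, 2]
    total, pos, prev = 0, 0, "0"
    for bit in stream:
        if bit == "1" and prev == "1":
            # Second consecutive 1: end of the current codeword.
            values.append(total)
            total, pos, prev = 0, 0, "0"
            continue
        if bit == "1":
            while len(fibs) <= pos:
                fibs.append(fibs[-1] + fibs[-2])
            total += fibs[pos]
        pos += 1
        prev = bit
    return values


if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5, 100]
    encoded = "".join(fib_encode(n) for n in numbers)
    print(encoded)
    print(fib_decode(encoded))
```

Because the codeword set is fixed, frequent symbols can simply be assigned the small integers; no code tree has to be built or stored, which is one reason such codes are attractive for large alphabets.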
