On the Data Expansion of the Huffman Compression Algorithm

While a file is being compressed with a Huffman code, its size may grow temporarily. This happens when the source letters with low frequencies, to which long codewords are assigned, are encoded first. The maximum data expansion is the average growth in bits per source letter resulting from the encoding of source letters with long codewords. It measures the worst-case temporary growth of the file. In this paper we study the maximum data expansion δ of Huffman codes. We provide some new properties of δ and use them to prove that δ < 1.256.
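To make the quantity concrete, the following is a minimal sketch rather than the paper's construction. It assumes, since the abstract does not give a formula, that each of the n source letters originally occupies log2(n) bits (with n a power of two) and that the expansion is the probability-weighted excess of codeword length over log2(n), summed over the letters whose codewords are longer than log2(n). The function names `huffman_lengths` and `data_expansion` are hypothetical.

```python
import heapq
import math
from itertools import count


def huffman_lengths(probs):
    """Return the codeword length of each letter in a binary Huffman code."""
    if len(probs) == 1:
        return [1]
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        # Merge the two least probable subtrees; every letter below them
        # moves one level deeper in the code tree.
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tiebreak), s1 + s2))
    return lengths


def data_expansion(probs):
    """Assumed definition: sum of p_i * (l_i - log2 n) over letters with l_i > log2 n."""
    fixed = math.log2(len(probs))  # assumed uncompressed cost per letter
    lengths = huffman_lengths(probs)
    return sum(p * (l - fixed) for p, l in zip(probs, lengths) if l > fixed)


probs = [0.5, 0.25, 0.125, 0.125]
print(huffman_lengths(probs))  # [1, 2, 3, 3]
print(data_expansion(probs))   # 0.25
```

In this example the two rarest letters receive 3-bit codewords while the uncompressed representation uses 2 bits per letter, so they are the only letters that contribute to the expansion. A uniform source on four letters gives every letter a 2-bit codeword and an expansion of 0 under the same assumed definition.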
