Energy Efficient Canonical Huffman Encoding

As data centers focus increasingly on energy efficiency, it becomes important to develop low power implementations of the applications that run on them. Data compression plays a critical role in data centers by reducing storage and communication costs. This work focuses on building a low power, high performance implementation of canonical Huffman encoding. We develop a number of hardware and software implementations targeting a Xilinx Zynq FPGA, an ARM Cortex-A9, and an Intel Core i7. Although canonical Huffman encoding is inherently sequential, we show that our hardware accelerated implementation is substantially more energy efficient than both the ARM and the Intel Core i7 implementations. Compared to highly optimized software running on the ARM processor, our hardware accelerated implementation achieves approximately 15 times the throughput with 10% higher power usage, resulting in an 8X benefit in energy efficiency (measured in encodings/Watt). Additionally, our hardware accelerated implementation is up to 80% faster and over 230 times more energy efficient than a highly optimized Core i7 implementation.
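For readers unfamiliar with the technique, the following is a minimal software sketch of canonical Huffman code construction. It is not the hardware or software implementation evaluated in this work. The function names `code_lengths` and `canonical_codes` are hypothetical, and the canonical ordering used here (shorter codes first, ties broken by symbol value) is the common convention, assumed rather than taken from the text.

```python
import heapq


def code_lengths(freqs):
    """Return {symbol: code length} from a Huffman tree built over freqs."""
    if len(freqs) == 1:
        # A single symbol still needs one bit to be representable.
        return {next(iter(freqs)): 1}
    # Heap entries: (weight, unique tie breaker, symbols in this subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, tie, s1 + s2))
        tie += 1
    return lengths


def canonical_codes(lengths):
    """Assign canonical codes: sort by (length, symbol) and count upward."""
    codes = {}
    code = 0
    prev_len = 0
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        # Moving to a longer code length appends zero bits to the counter.
        code <<= length - prev_len
        codes[sym] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes


if __name__ == "__main__":
    freqs = {"a": 5, "b": 2, "r": 2, "c": 1, "d": 1}  # counts in "abracadabra"
    for sym, bits in sorted(canonical_codes(code_lengths(freqs)).items()):
        print(sym, bits)
```

Because the codes are fully determined by the code lengths, only the lengths need to be stored or transmitted alongside the compressed data; the decoder can rebuild the same code table from them.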
