Run-Length Base-Delta Encoding for High-Speed Compression

In modern supercomputers, nodes are connected by networking hardware capable of up to 40 Gb/s. Data compression could raise the effective bandwidth further. However, compression for such systems requires a particular tradeoff between the compression ratio a scheme delivers and the speed of compression and decompression. While traditional software compression techniques may deliver high compression ratios, they cannot sustain the compression/decompression throughput that is needed. We present Run-Length Base-Delta (RLBD) encoding, a software compression format and algorithm that delivers high-speed compression and decompression suitable for data transfers at up to 40 Gb/s (40GbE). RLBD can be implemented on CPUs or on parallel accelerator devices, and it improves network throughput by up to 57% when transmitting data from real datasets.
