Error Analysis of ZFP Compression for Floating-Point Data

Compression of floating-point data will play an important role in high-performance computing as data bandwidth and storage become dominant costs. Lossy compression of floating-point data is powerful, but theoretical results are needed to bound its errors when it is used to store look-up tables, simulation results, or even the solution state during the computation. In this paper, we analyze the round-off error introduced by ZFP, a lossy compression algorithm. The stopping criterion for ZFP depends on the compression mode specified by the user: fixed rate, fixed accuracy, or fixed precision [P. Lindstrom, Fixed-rate compressed floating-point arrays, IEEE Transactions on Visualization and Computer Graphics, 2014]. While most of our discussion focuses on the fixed-precision mode of ZFP, we establish a bound on the error introduced by all three compression modes. To capture the error tightly, we first introduce a vector space that allows us to work with binary representations of components. In this vector space, we define operators that implement each step of ZFP compression and decompression, and we use them to establish a bound on the error caused by ZFP. To conclude, we provide numerical tests that demonstrate the accuracy of the established bounds.
