暂无分享,去创建一个
ROOT provides an flexible format used throughout the HEP community. The number of use cases - from an archival data format to end-stage analysis - has required a number of tradeoffs to be exposed to the user. For example, a high "compression level" in the traditional DEFLATE algorithm will result in a smaller file (saving disk space) at the cost of slower decompression (costing CPU time when read). At the scale of the LHC experiment, poor design choices can result in terabytes of wasted space or wasted CPU time. We explore and attempt to quantify some of these tradeoffs. Specifically, we explore: the use of alternate compressing algorithms to optimize for read performance; an alternate method of compressing individual events to allow efficient random access; and a new approach to whole-file compression. Quantitative results are given, as well as guidance on how to make compression decisions for different use cases.
[1] 12th USENIX Symposium on Networked Systems Design and Implementation, NSDI 15, Oakland, CA, USA, May 4-6, 2015 , 2015, NSDI.
[2] Abraham Lempel,et al. A universal algorithm for sequential data compression , 1977, IEEE Trans. Inf. Theory.
[3] Peter Deutsch,et al. ZLIB Compressed Data Format Specification version 3.3 , 1996, RFC.
[4] F. Rademakers,et al. ROOT — An object oriented data analysis framework , 1997 .