Transparent Online Storage Compression at the Block-Level

In this work, we examine how transparent block-level compression in the I/O path can improve both the space efficiency and performance of online storage. We present ZBD, a block-layer driver that transparently compresses and decompresses data as they flow between the file system and storage devices. Our system supports variable-size blocks, metadata caching, and persistence, as well as block allocation and cleanup. ZBD aims to maintain high performance by mitigating compression and decompression overheads, which can otherwise have a significant impact, through explicit work scheduling that leverages modern multicore CPUs. We present two case studies for compression. First, we examine how our approach can increase the capacity of SSD-based caches, thus increasing their cost-effectiveness. Then, we examine how ZBD can improve the efficiency of online disk-based storage systems. We evaluate our approach in the Linux kernel on a commodity server with multicore CPUs, using PostMark, SPECsfs2008, TPC-C, and TPC-H. Preliminary results show that transparent online block-level compression is a viable option for improving effective storage capacity: it can improve I/O performance by up to 80% by reducing I/O traffic and seek distance, and it degrades performance, by up to 34%, only when single-thread I/O latency is critical. In particular, for SSD-based caching, our results indicate that, in line with current technology trends, compressed caching trades off CPU utilization for performance and enhances SSD efficiency as a storage cache by up to 99%.
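
To make the idea of variable-size blocks with allocation and cleanup more concrete, the following is a minimal in-memory sketch in Python. It is not ZBD's implementation: the class name `CompressedBlockStore`, the 4 KiB logical block size, the append-only log, and the use of `zlib` as the compressor are all assumptions chosen for illustration, and the sketch leaves out multicore work scheduling, metadata caching, and persistence.

```python
import zlib

BLOCK_SIZE = 4096  # assumed logical block size in bytes


class CompressedBlockStore:
    """Toy model of a transparent compressing block layer (hypothetical)."""

    def __init__(self):
        self.log = bytearray()  # stands in for the physical device
        # logical block number -> (offset, length, is_compressed)
        self.extent_map = {}

    def write_block(self, lbn, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("callers always write whole logical blocks")
        packed = zlib.compress(data)
        # Store the raw block when compression does not save space.
        compressed = len(packed) < BLOCK_SIZE
        payload = packed if compressed else bytes(data)
        # Append-only allocation: an overwrite leaves the old extent
        # behind as garbage that a cleanup pass would later reclaim.
        self.extent_map[lbn] = (len(self.log), len(payload), compressed)
        self.log += payload

    def read_block(self, lbn):
        offset, length, compressed = self.extent_map[lbn]
        payload = bytes(self.log[offset:offset + length])
        return zlib.decompress(payload) if compressed else payload

    def live_bytes(self):
        # Physical bytes still referenced by the mapping.
        return sum(length for _, length, _ in self.extent_map.values())

    def garbage_bytes(self):
        # Physical bytes occupied by superseded extents.
        return len(self.log) - self.live_bytes()
```

A caller writes and reads fixed-size logical blocks and never sees the variable-size extents, which is the sense in which compression at this layer is transparent. The mapping table is the metadata that a real driver would have to cache and persist, and `garbage_bytes()` hints at why a cleanup mechanism is needed once blocks are overwritten.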


[8]  Angelos Bilas, et al.  Transparent Online Storage Compression at the Block-Level, 2012.
