Pensieve: a Machine Learning Assisted SSD Layer for Extending the Lifetime

As the cost per unit of capacity drops, flash-based SSDs are becoming popular in a wide range of computing scenarios. However, the limited number of program-erase cycles still severely limits the cost-effectiveness of flash-based storage solutions. This paper proposes Pensieve, a machine-learning-assisted SSD firmware layer that transparently reduces the demand for program and erase operations. Pensieve efficiently classifies incoming write data into different compression categories without hints from the software stack. Data in the same category can be compressed with a shared dictionary, allowing Pensieve to further avoid storing duplicate content. Because Pensieve requires no modification to the software stack, it is compatible with existing applications, file systems, and operating systems. On modern SSD architectures, implementing a Pensieve-compliant SSD also requires no additional hardware, making it a drop-in upgrade for existing storage systems. Experimental results on our prototype Pensieve SSD show that Pensieve reduces the number of program operations by 19% while delivering competitive performance.
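To make the idea of per-category shared dictionaries concrete, the following is a minimal sketch in Python and not the Pensieve firmware itself. It assumes a toy byte-frequency heuristic in place of the learned classifier, two hypothetical categories ("text" and "zero") with made-up dictionary contents, and DEFLATE with a preset dictionary (via the standard `zlib` module) as the compressor. None of these choices are taken from the paper.

```python
import zlib
from typing import Tuple

# Hypothetical per-category shared dictionaries (illustrative contents only).
DICTIONARIES = {
    "text": b"the and of to in is that for with as ",
    "zero": b"\x00" * 64,
}


def classify(block: bytes) -> str:
    """Toy stand-in for the learned classifier: a zero-byte-ratio heuristic."""
    if not block:
        return "zero"
    zero_fraction = block.count(0) / len(block)
    return "zero" if zero_fraction > 0.5 else "text"


def compress_block(block: bytes) -> Tuple[str, bytes]:
    """Pick a category, then compress with that category's shared dictionary."""
    category = classify(block)
    comp = zlib.compressobj(level=6, zdict=DICTIONARIES[category])
    return category, comp.compress(block) + comp.flush()


def decompress_block(category: str, payload: bytes) -> bytes:
    """Decompress using the same dictionary that was used for compression."""
    decomp = zlib.decompressobj(zdict=DICTIONARIES[category])
    return decomp.decompress(payload) + decomp.flush()


if __name__ == "__main__":
    block = b"the cost of the drive and the lifetime of the flash"
    category, payload = compress_block(block)
    assert decompress_block(category, payload) == block
    print(category, len(block), len(payload))
```

In this sketch the category must be stored alongside the compressed payload, since decompression needs the matching dictionary; a real firmware layer would presumably keep that tag in its mapping metadata, which is an assumption here.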
