Cap: Exploiting Data Correlations to Improve the Performance and Endurance of SSD RAID

Parity-based RAID provides system-level fault tolerance. However, the parity updates caused by small writes introduce many extra I/Os, which degrade I/O performance and wear out SSDs. Prior work has proposed using Non-Volatile Memory (NVM) as a parity cache on an SSD RAID, postponing parity updates until the whole stripe has been updated. This approach often fails because hot data chunks are unevenly distributed within a stripe: in real workloads, a full-stripe update is often difficult to achieve even after a long delay. In this paper, we propose a correlation-aware parity caching scheme, called Cap, for SSD-based RAIDs. The key idea behind Cap is to periodically reconstruct correlated hot data chunks into a new stripe. Because these data chunks are strongly correlated, they tend to be updated together within a short time span. This co-update within a stripe uses the parity cache more efficiently to convert partial-stripe updates into a full-stripe update. We have implemented Cap on a RAID-5 SSD array in Linux kernel 4.3. Experimental results show that Cap improves I/O bandwidth by 54% to 145% compared with Linux software RAID. Compared with the state-of-the-art parity caching scheme PPC, Cap improves I/O bandwidth by 14% to 31%.
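The effect of regrouping correlated chunks can be illustrated with a toy model. The sketch below is not Cap's algorithm or its kernel implementation; it only assumes a simplified parity cache that counts a stripe as a full-stripe update once every data chunk in it has been written. The function name `full_stripe_updates`, the two layouts, and the write trace are hypothetical and chosen for illustration.

```python
def full_stripe_updates(layout, write_batches):
    """Count full-stripe updates under a simplified deferred-parity model.

    layout: list of tuples, each tuple holding the data-chunk IDs of one stripe.
    write_batches: iterable of tuples of chunk IDs that are written together.
    A stripe counts as fully updated once all of its data chunks are dirty;
    its dirty set is then cleared (parity is assumed to be written once).
    """
    stripe_of = {chunk: sid for sid, stripe in enumerate(layout) for chunk in stripe}
    dirty = {sid: set() for sid in range(len(layout))}
    full = 0
    for batch in write_batches:
        for chunk in batch:
            sid = stripe_of[chunk]
            dirty[sid].add(chunk)
            if dirty[sid] == set(layout[sid]):
                full += 1
                dirty[sid].clear()
    return full


# Hypothetical workload: chunks 0, 1, 4, 5 are hot and always updated together.
hot_batches = [(0, 1, 4, 5)] * 3

# Original layout: the hot chunks are split across two stripes.
original_layout = [(0, 1, 2, 3), (4, 5, 6, 7)]

# Regrouped layout: the correlated hot chunks share one stripe.
regrouped_layout = [(0, 1, 4, 5), (2, 3, 6, 7)]

original_full = full_stripe_updates(original_layout, hot_batches)
regrouped_full = full_stripe_updates(regrouped_layout, hot_batches)
print(original_full, regrouped_full)
```

In this model the original layout never completes a stripe, because the cold chunks 2, 3, 6, and 7 are never written, while the regrouped layout completes one stripe per batch. A real parity cache would also have to handle eviction, limited capacity, and the cost of remapping, none of which this sketch models.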
