SSD caching to overcome small write problem of disk-based RAID in enterprise environments

Disk-based RAID still prevails in enterprise environments due to its cost-effectiveness, reliability, and maintainability. However, it suffers from parity update overhead, generally called the small write problem, which significantly degrades performance for small write requests. To address this overhead, our design choice is to employ a Flash-based SSD cache on top of a disk-based RAID storage server. In particular, we use a single SSD, available in consumer markets, as the caching device. However, an SSD has a non-negligible failure rate, so reliability may be compromised without appropriate measures to protect data from failure. To ensure reliability upon failures while eliminating parity update overhead, we devise an SSD cache management scheme that we refer to as the LeavO cache. The LeavO cache keeps not only new data but also old data in the SSD cache, postponing parity updates in the RAID storage until the old data are discarded for space recycling. By doing so, lost data can be recovered upon failure using either the old data and old parity or the new data in the SSD cache. We implement the LeavO cache in a real Linux system and measure the performance of the storage server with and without it. We also compare, through mathematical analysis, the reliability of the LeavO cache with that of conventional RAID-0 and RAID-5 configurations. Experimental results and analyses show that the LeavO cache eliminates much of the parity update overhead while providing reliability and maintainability comparable to conventional RAID configurations.
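The recovery argument above can be illustrated with a toy model. The sketch below is not the authors' implementation; it is a minimal, hypothetical single-stripe model (class and method names are invented) showing the key idea: a small write defers the parity update by retaining the old block in the SSD cache, and a lost stripe block is later recoverable either from the cached new data or by XOR-ing the old parity with the cached old versions of any updated blocks.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

class LeavoCacheModel:
    """Toy single-stripe model of a LeavO-style cache (hypothetical names)."""

    def __init__(self, stripe):
        self.disks = list(stripe)                 # RAID data blocks
        self.parity = stripe[0]
        for blk in stripe[1:]:                    # old (still consistent) parity
            self.parity = xor(self.parity, blk)
        self.old = {}                             # SSD cache: old versions
        self.new = {}                             # SSD cache: new versions

    def small_write(self, idx, data):
        # Defer the parity update: instead of the usual
        # read-old-data / read-old-parity / write-new-parity cycle,
        # just retain the old block in the SSD cache.
        if idx not in self.old:
            self.old[idx] = self.disks[idx]
        self.new[idx] = data
        self.disks[idx] = data                    # new data reaches the disk

    def recover(self, failed):
        # Case 1: the failed block was recently written; its new copy
        # is still in the SSD cache.
        if failed in self.new:
            return self.new[failed]
        # Case 2: reconstruct with the *old* parity, substituting the
        # cached old version for every block updated since the last
        # parity write.
        acc = self.parity
        for i, blk in enumerate(self.disks):
            if i != failed:
                acc = xor(acc, self.old.get(i, blk))
        return acc
```

For example, after `small_write(0, b"XXXX")` on a stripe `[b"AAAA", b"BBBB", b"CCCC"]`, block 0 is recovered from the cached new data, while blocks 1 and 2 are still recoverable from the stale parity because the cached old copy of block 0 restores parity consistency.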
