SWANS: An Interdisk Wear-Leveling Strategy for RAID-0 Structured SSD Arrays

NAND flash memory–based solid state disks (SSDs) are widely used in enterprise servers. However, flash memory has limited write endurance: a block becomes unreliable after a finite number of program/erase cycles. Existing wear-leveling techniques are essentially intradisk data distribution schemes, because they can only even out wear across the flash medium within a single SSD. When multiple SSDs are organized as an array in server applications, an interdisk wear-leveling technique that ensures a uniform wear-out distribution across SSDs is much needed. In this article, we propose a novel SSD-array-level wear-leveling strategy called SWANS (<u>S</u>moothing <u>W</u>ear <u>A</u>cross <u>N</u> <u>S</u>SDs) for an SSD array structured in a RAID-0 format, which is frequently used in server applications. SWANS dynamically monitors and balances the write distribution across SSDs. To evaluate its effectiveness, we build an SSD array simulator on top of a validated single-SSD simulator and implement SWANS in its array controller. Comprehensive experiments with real-world traces show that SWANS decreases the standard deviation of writes across SSDs by 16.7x on average. The gap in total bytes written between the most-written SSD and the least-written SSD in an 8-SSD array shrinks by at least 1.3x.
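To make the two reported metrics concrete, the following is a minimal sketch of how per-SSD write totals and their imbalance could be measured for a plain RAID-0 array. It is not the SWANS algorithm: it only shows conventional round-robin striping and the bookkeeping needed to compute a standard deviation and a most-written/least-written gap. The function names, the 64 KiB stripe unit, the use of the population standard deviation, and the choice to express the gap as a byte difference are all assumptions made for illustration.

```python
from statistics import pstdev

def raid0_disk(offset, stripe_unit=64 * 1024, num_ssds=8):
    """Index of the SSD holding the byte at `offset` under round-robin striping."""
    return (offset // stripe_unit) % num_ssds

def per_ssd_writes(requests, stripe_unit=64 * 1024, num_ssds=8):
    """Sum the bytes written to each SSD for a list of (offset, length) writes."""
    totals = [0] * num_ssds
    for offset, length in requests:
        end = offset + length
        while offset < end:
            # Split the request at the next stripe-unit boundary.
            chunk_end = min(end, (offset // stripe_unit + 1) * stripe_unit)
            totals[raid0_disk(offset, stripe_unit, num_ssds)] += chunk_end - offset
            offset = chunk_end
    return totals

def imbalance(totals):
    """Return (population standard deviation, max-minus-min gap) of the totals."""
    return pstdev(totals), max(totals) - min(totals)
```

For instance, a trace that repeatedly overwrites the same stripe unit concentrates all of its writes on one SSD, so both values returned by `imbalance` grow. This is the kind of skew an interdisk wear-leveling scheme aims to reduce.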
