Chameleon: An Adaptive Wear Balancer for Flash Clusters

NAND flash-based Solid State Devices (SSDs) offer high performance, energy efficiency, and rapidly growing capacity, so their use in distributed storage systems is increasing. A key obstacle in this context is that the natural imbalance in distributed I/O workloads can cause wear imbalance across the SSDs in a distributed setting. This, in turn, can significantly impact the reliability, performance, and lifetime of the storage deployment. Extant load balancers for storage systems do not consider SSD wear imbalance when placing data, because their main design goal is to extract higher performance. Consequently, data migration, where existing data is moved from heavily loaded servers to the least loaded ones, is the only common technique for tackling wear imbalance. In this paper, we explore a holistic approach, Chameleon, that employs data redundancy techniques such as replication and erasure coding, coupled with endurance-aware write offloading, to mitigate wear imbalance in distributed SSD-based storage. Chameleon aims to balance wear among different flash servers while meeting the desirable objectives of extending the life of flash servers, improving I/O performance, and avoiding bottlenecks. Evaluation on a 50-node SSD cluster shows that Chameleon reduces the wear distribution deviation by 81% while improving write performance by up to 33%.
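The abstract does not spell out how endurance-aware placement works. As a minimal, hypothetical sketch (not Chameleon's actual algorithm), a placement policy in this spirit could steer each incoming write's replicas toward the servers with the most remaining endurance, so that wear evens out over time; all names and the wear metric below are assumptions for illustration:

```python
import heapq

def pick_replica_targets(wear_pct, k=3):
    """Return the ids of the k least-worn servers for a new write.

    wear_pct: dict mapping server id -> fraction of rated P/E cycles
    already consumed (0.0 = fresh, 1.0 = worn out).
    """
    # Prefer servers with the lowest wear; break ties by id so the
    # choice is deterministic.
    return heapq.nsmallest(k, wear_pct, key=lambda s: (wear_pct[s], s))

# Example: offload a 3-way replicated write away from the worn server s0.
wear = {"s0": 0.92, "s1": 0.15, "s2": 0.48, "s3": 0.21}
targets = pick_replica_targets(wear, k=3)
```

A real balancer would also weigh current load and network cost, not wear alone, to avoid turning the least-worn servers into performance hotspots.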
