The Harey Tortoise: Managing Heterogeneous Write Performance in SSDs

Recent years have witnessed significant gains in the adoption of flash technology due to increases in bit density, which enable higher capacities and lower prices. Unfortunately, these improvements come at a significant cost to performance, with trends pointing toward worst-case flash program latencies on par with disk writes. We extend a conventional flash translation layer to schedule flash program operations onto flash pages based on the operations' performance needs and the pages' performance characteristics. We then develop policies to improve performance in two scenarios. First, we improve peak performance for latency-critical operations during short bursts of intensive activity by 36%. Second, we realize steady-state bandwidth improvements of up to 95% by rate-matching garbage collection performance and external access performance.
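
The scheduling idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the translation layer tracks free pages in two pools, fast-programming and slow-programming, and steers each program operation to the pool matching its needs, falling back to the other pool when the preferred one is empty. The names `PagePool` and `allocate` and both latency constants are hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass, field

# Assumed program latencies in microseconds (illustrative values, not measurements).
FAST_PROGRAM_US = 500
SLOW_PROGRAM_US = 2500


@dataclass
class PagePool:
    """Hypothetical free-page tracker that separates fast and slow pages."""
    fast: deque = field(default_factory=deque)
    slow: deque = field(default_factory=deque)

    def allocate(self, latency_critical: bool):
        """Return (page_id, program_latency_us) for one program operation.

        Latency-critical writes prefer fast pages; background writes such as
        garbage collection prefer slow pages. Either falls back to the other
        pool when its preferred pool is empty.
        """
        if latency_critical:
            order = (self.fast, self.slow)
        else:
            order = (self.slow, self.fast)
        for pool in order:
            if pool:
                page = pool.popleft()
                latency = FAST_PROGRAM_US if pool is self.fast else SLOW_PROGRAM_US
                return page, latency
        raise RuntimeError("no free pages; garbage collection required")
```

In this sketch a burst of latency-critical writes drains the fast pool first, while background writes consume slow pages and so preserve fast pages for the next burst. A real device also constrains the order in which pages within a block may be programmed, which this sketch ignores.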
