Offline and Online Algorithms for SSD Management

Flash-based solid-state drives (SSDs) have gained a central role in the infrastructure of large-scale datacenters, as well as in commodity servers and personal devices. The main limitation of flash media is its inability to support update-in-place: once data has been written to a physical location, that location must be erased before new data can be written to it. Moreover, SSDs support read and write operations at the granularity of pages, while erasures are performed on entire blocks, which often contain hundreds of pages. When a block is erased, any valid data it stores must be rewritten to a clean location. Because an SSD wears out as the number of erasures grows, the efficiency of the management algorithm has a significant impact on its endurance. In this paper we first formally define the SSD management problem. We then explore this problem from an algorithmic perspective, considering it in both offline and online settings. In the offline setting, we present a near-optimal algorithm that, given any input, performs a negligible number of rewrites (relative to the input length). We also discuss the hardness of the offline problem. In the online setting, we first consider algorithms that have no prior knowledge about the input. We prove that no deterministic algorithm outperforms the greedy algorithm in this setting, and discuss the possible benefit of randomization. We then augment our model, assuming that each request for a page arrives with a prediction of the next time the page is updated. We design an online algorithm that uses such predictions, and show that its performance improves as the prediction error decreases. We also show that the performance of our algorithm is never worse than that guaranteed by the greedy algorithm, even when the prediction error is large. We complement our theoretical findings with an empirical evaluation of our algorithms, comparing them with the state-of-the-art scheme. The results confirm that our algorithms exhibit improved performance for a wide range of input traces.
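
The page/block mismatch described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's algorithm or formal model: it assumes a log-structured drive with a single open block, takes "greedy" to mean erasing the full block that holds the fewest valid pages, and assumes the surviving pages of the victim are buffered and rewritten into that same block after the erase. The function name `simulate_greedy` and its parameters are hypothetical.

```python
def simulate_greedy(requests, num_blocks, pages_per_block):
    """Count (rewrites, erasures) for a toy SSD with greedy garbage collection.

    requests: iterable of logical page ids, each one a write/update request.
    Assumption: survivors of an erased block are rewritten into that block.
    """
    valid = [set() for _ in range(num_blocks)]  # valid logical pages per block
    used = [0] * num_blocks                     # slots programmed since last erase
    where = {}                                  # logical page -> block holding it
    clean = list(range(1, num_blocks))          # erased, unused blocks
    open_blk = 0                                # block currently receiving writes
    rewrites = erasures = 0

    for page in requests:
        # No update-in-place: the old copy (if any) just becomes invalid.
        if page in where:
            valid[where[page]].discard(page)

        if used[open_blk] == pages_per_block:   # open block is full
            if clean:
                open_blk = clean.pop()
            else:
                # Greedy choice: the block with the fewest valid pages.
                victim = min(range(num_blocks), key=lambda b: len(valid[b]))
                if len(valid[victim]) == pages_per_block:
                    raise ValueError("no reclaimable space on the drive")
                erasures += 1
                rewrites += len(valid[victim])  # survivors must be rewritten
                used[victim] = len(valid[victim])
                open_blk = victim

        valid[open_blk].add(page)
        used[open_blk] += 1
        where[page] = open_blk

    return rewrites, erasures


# Two blocks of two pages each, three logical pages.
print(simulate_greedy([0, 1, 2, 0, 2], num_blocks=2, pages_per_block=2))  # (1, 1)
```

In this toy trace the final update forces an erasure while one valid page still sits in the victim block, so that page is rewritten once. Under the same assumptions, an algorithm that knows when each page will next be updated could try to group pages that die together, which is the intuition behind the offline and prediction-augmented settings mentioned above.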
