Sampling-based garbage collection metadata management scheme for flash-based storage

Existing garbage collection algorithms for flash-based storage use score-based heuristics to select victim blocks for reclaiming free space and for wear leveling. The score for a block is estimated from metadata such as age, block utilization, and erase count. To find a victim block quickly, these algorithms maintain a priority queue in the SRAM of the storage controller. This priority queue takes O(K) space, where K is the flash storage capacity measured in total number of blocks. As flash capacity grows, K grows with it. However, because SRAM has a higher price per byte, it will not scale proportionately, so it becomes challenging to fit a larger priority queue in the limited SRAM of a large-capacity flash storage device. In addition to the space issue, the priority queue must be updated whenever the metadata changes, and each update takes O(log K) operations. This computation overhead also increases as flash capacity increases. In this paper, we take a novel approach to the garbage collection metadata management problem for large-capacity flash storage. We propose a sampling-based approach that approximates existing garbage collection algorithms within the limited SRAM space. Since these algorithms are heuristic-based, our sampling-based algorithm will perform as well as the unsampled (original) algorithm if we choose good samples for making garbage collection decisions. We propose a very simple policy for choosing samples. Our experimental results show that a small number of samples is sufficient to emulate existing garbage collection algorithms.
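The contrast between full-queue and sampled victim selection can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the sampling policy or the score function, so uniform random sampling, the cost-benefit style score, the `Block` fields, and `PAGES_PER_BLOCK` are all assumptions chosen for illustration.

```python
import random
from dataclasses import dataclass

PAGES_PER_BLOCK = 64  # hypothetical block geometry


@dataclass
class Block:
    valid_pages: int  # pages still holding live data
    erase_count: int  # kept as metadata; unused by this particular score
    age: int          # time since last update, arbitrary units


def score(block: Block) -> float:
    """Assumed cost-benefit style score: higher means a better victim."""
    u = block.valid_pages / PAGES_PER_BLOCK  # block utilization in [0, 1]
    if u == 0:
        return float("inf")  # nothing to copy, ideal victim
    return (1.0 - u) * block.age / (2.0 * u)


def select_victim_full(blocks: list[Block]) -> int:
    """Baseline: examine every block (what a global priority queue emulates)."""
    return max(range(len(blocks)), key=lambda i: score(blocks[i]))


def select_victim_sampled(blocks: list[Block], n_samples: int,
                          rng: random.Random) -> int:
    """Pick the best victim among n_samples randomly chosen blocks.

    Uniform random sampling is an assumption; the source only says the
    sampling policy is "very simple".
    """
    k = min(n_samples, len(blocks))
    candidates = rng.sample(range(len(blocks)), k)
    return max(candidates, key=lambda i: score(blocks[i]))
```

In this sketch the per-decision working set is the `n_samples` candidates rather than all K blocks, which is the kind of saving the abstract describes; how the per-block metadata itself is stored outside SRAM is not covered here.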
