DCR: Deterministic Crash Recovery for NAND Flash Storage Systems

NAND flash memory has been widely adopted as a storage medium. Because a power failure may occur at any time and cause data loss, crash recovery is vitally important in NAND flash storage systems. Since flash translation layers (FTLs) are used to manage flash memory, the crash recovery problem in NAND flash is how to recover FTL metadata efficiently and consistently after a system crash. In this paper, we present deterministic crash recovery (DCR), which takes a deterministic approach to crash recovery in NAND flash storage systems. The basic idea is to exploit the determinism of FTLs and, during crash recovery, reproduce the events that happened between the last checkpoint and the crash point. Unlike existing approaches that need to scan the whole flash memory chip, DCR recovers the system more efficiently by checking only a limited number of blocks, which are identified from the deterministic FTL operations. We have implemented DCR in an FTL and compared it with representative version-based and power-loss recovery schemes on an ARM-based embedded system. Experimental results show that DCR can greatly reduce the recovery time and guarantee the consistency of FTL metadata after recovery.
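
The replay idea can be illustrated with a toy model. The sketch below is not the DCR implementation from the paper. It assumes a hypothetical FTL that allocates physical pages strictly in order, stores the logical page number in each page's spare area, and checkpoints its mapping table together with the allocation cursor. All names (Flash, ToyFTL, recover, PAGES_PER_BLOCK) are invented for illustration. Because allocation is deterministic, recovery can start from the checkpointed cursor and visit only the pages the FTL would have written next, stopping at the first erased page.

```python
from dataclasses import dataclass, field

PAGES_PER_BLOCK = 4  # toy geometry (assumption, not from the paper)


@dataclass
class Flash:
    """Toy flash: each programmed page records its logical page number (LPN)."""
    num_blocks: int
    pages: dict = field(default_factory=dict)  # (block, page) -> lpn
    reads: int = 0  # counts spare-area reads issued during recovery

    def program(self, block, page, lpn):
        self.pages[(block, page)] = lpn

    def read_spare(self, block, page):
        self.reads += 1
        return self.pages.get((block, page))  # None means the page is still erased


@dataclass
class ToyFTL:
    flash: Flash
    mapping: dict = field(default_factory=dict)  # lpn -> (block, page)
    cursor: int = 0  # index of the next physical page in allocation order

    def next_location(self):
        # Deterministic allocation: physical pages are used strictly in order.
        return divmod(self.cursor, PAGES_PER_BLOCK)

    def write(self, lpn):
        block, page = self.next_location()
        self.flash.program(block, page, lpn)
        self.mapping[lpn] = (block, page)
        self.cursor += 1

    def checkpoint(self):
        # Stand-in for persisting metadata to flash.
        return dict(self.mapping), self.cursor


def recover(flash, checkpoint):
    """Rebuild the mapping by replaying the writes made after the checkpoint."""
    mapping, cursor = checkpoint
    ftl = ToyFTL(flash, dict(mapping), cursor)
    while ftl.cursor < flash.num_blocks * PAGES_PER_BLOCK:
        block, page = ftl.next_location()
        lpn = flash.read_spare(block, page)
        if lpn is None:  # first erased page marks the crash point
            break
        ftl.mapping[lpn] = (block, page)
        ftl.cursor += 1
    return ftl


# Usage: three writes, a checkpoint, two more writes, then a simulated crash.
flash = Flash(num_blocks=64)
ftl = ToyFTL(flash)
for lpn in (1, 2, 3):
    ftl.write(lpn)
saved = ftl.checkpoint()
for lpn in (2, 7):
    ftl.write(lpn)
recovered = recover(flash, saved)  # volatile state is rebuilt from flash + checkpoint
```

In this toy run, recovery issues three spare-area reads (two replayed pages plus the erased page that ends the replay) rather than visiting all 256 pages of the simulated chip. A real FTL also has to replay garbage collection and other internal operations, which this sketch leaves out.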
