Reliability mechanisms for file systems using non-volatile memory as a metadata store

Portable devices such as cell phones and media players commonly use non-volatile RAM (NVRAM) to hold all of their data and metadata, and larger systems can store metadata in NVRAM to increase file system performance by reducing synchronization and transfer overhead between on-disk and in-memory data structures. Unfortunately, wayward writes from buggy software and random bit flips can corrupt the persistent store. We introduce two orthogonal and complementary approaches to reliably storing file system structures in NVRAM. First, we reinforce hardware and operating system memory consistency by employing page-level write protection and error-correcting codes. Second, we perform online consistency checking of the file system structures by replaying logged file system transactions on copied data structures; a structure is consistent if the replayed copy matches its live counterpart. Our experiments show that the protection mechanisms can increase fault tolerance by six orders of magnitude while incurring an acceptable amount of overhead on writes to NVRAM. Since NVRAM is much faster and consumes far less power than disk-based storage, an NVRAM-based system remains both faster and more reliable than a disk-based system even with the added overhead of error checking. Additionally, our techniques can be implemented on systems lacking hardware support for memory management, allowing them to be used on low-end and embedded systems without an MMU.
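
As a rough illustration of the second approach, the replay-based check can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation evaluated in the paper: the inode table is modeled as a Python dictionary, and the operation names (`create`, `set_size`, `link`, `unlink`) and the helpers `apply_op` and `is_consistent` are hypothetical names chosen for this example.

```python
import copy


def apply_op(inodes, op):
    """Apply one logged metadata operation to an inode table (a dict).

    The operation format (kind, inode number, value) is an assumption
    made for this sketch, not a format taken from the paper.
    """
    kind, ino, value = op
    if kind == "create":
        inodes[ino] = {"size": 0, "nlink": 1}
    elif kind == "set_size":
        inodes[ino]["size"] = value
    elif kind == "link":
        inodes[ino]["nlink"] += 1
    elif kind == "unlink":
        inodes[ino]["nlink"] -= 1
        if inodes[ino]["nlink"] == 0:
            del inodes[ino]
    else:
        raise ValueError(f"unknown operation: {kind}")


def is_consistent(snapshot, log, live):
    """Replay the log on a copy of the snapshot and compare with live state."""
    replica = copy.deepcopy(snapshot)  # never modify the snapshot itself
    for op in log:
        apply_op(replica, op)
    return replica == live


# Build a live structure and a matching transaction log.
snapshot = {}
live = {}
log = [("create", 1, None), ("set_size", 1, 4096), ("link", 1, None)]
for op in log:
    apply_op(live, op)

ok_before = is_consistent(snapshot, log, live)

# Simulate a random bit flip in the live structure.
live[1]["size"] ^= 1 << 3
ok_after = is_consistent(snapshot, log, live)
```

Here `ok_before` is true because the replayed copy matches the live table, and `ok_after` is false because the simulated bit flip makes the two diverge. A real checker would presumably run incrementally and compare individual structures rather than the whole table at once.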
