Accelerating Cloud Storage System with Byte-Addressable Non-Volatile Memory

As a building block for cloud storage, a distributed file system relies on underlying local file systems to manage objects. However, the underlying file system is limited by metadata and journaling I/O, which significantly degrades the performance of the distributed file system. This paper presents an NVM-based file system, referred to as NV-Booster, that accelerates object access on storage nodes. NV-Booster leverages the byte-addressability and persistence of non-volatile memory (NVM) to speed up metadata accesses and file system journaling. With NV-Booster, metadata is kept in NVM and accessed in a byte-addressable manner over the memory bus, while objects are stored on hard disk and accessed over the I/O bus. In addition, NV-Booster provides fast object search and mapping between object IDs and on-disk locations through efficient in-memory namespace management. NV-Booster is implemented in kernel space with NVDIMM and has been extensively evaluated under various workloads. Our experiments show that NV-Booster improves Ceph performance by up to 10x compared to Ceph running on existing local file systems.
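The split between metadata and object data described above can be illustrated with a small user-space sketch. It is not the NV-Booster implementation, which runs in kernel space on NVDIMM: here a Python dict stands in for the NVM-resident namespace, an ordinary file stands in for the hard disk, and the names `Extent`, `ObjectStoreSketch`, `put`, and `get` are hypothetical. The sketch omits journaling and crash consistency entirely.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    offset: int  # byte offset of the object in the data file
    length: int  # object size in bytes


class ObjectStoreSketch:
    """Toy model: metadata in a dict (stand-in for NVM), data in a file (stand-in for disk)."""

    def __init__(self, data_path):
        self.namespace = {}  # object ID -> Extent; hypothetical NVM-resident index
        self.data_path = data_path
        self.next_free = 0   # simple append-only allocator (an assumption)
        open(data_path, "wb").close()  # create an empty data file

    def put(self, object_id, payload):
        extent = Extent(self.next_free, len(payload))
        with open(self.data_path, "r+b") as f:
            f.seek(extent.offset)
            f.write(payload)
        # The metadata update touches only the in-memory map, not the data file.
        self.namespace[object_id] = extent
        self.next_free += len(payload)

    def get(self, object_id):
        # Resolve the object ID to its location without any metadata I/O on the data file.
        extent = self.namespace[object_id]
        with open(self.data_path, "rb") as f:
            f.seek(extent.offset)
            return f.read(extent.length)
```

In this sketch a lookup costs one dictionary access followed by a single read of the data file, which mirrors the idea of resolving object IDs in memory and reserving the I/O bus for object data.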
