HSM2: A Hybrid and Scalable Metadata Management Method in Distributed File Systems

In the big data era, metadata performance is critical in modern distributed file systems. Traditional metadata management strategies such as subtree partitioning focus on preserving namespace locality, while others such as hash-based mapping aim for good load balance. Nevertheless, none of these methods achieves both desirable properties simultaneously. To close this gap, we propose a novel metadata management scheme, HSM\(^{2}\), which combines subtree partitioning and hash-based mapping. We implemented HSM\(^{2}\) in CephFS, a widely deployed distributed file system, and conducted a comprehensive set of metadata-intensive experiments. Experimental results show that HSM\(^{2}\) can achieve better namespace locality and load balance simultaneously. Compared with CephFS, HSM\(^{2}\) can reduce the completion time by 70% and achieve a 3.9\(\times \) overall throughput speedup for a file-scanning workload.
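The trade-off between the two traditional strategies can be illustrated with a minimal sketch. It is not the HSM\(^{2}\) algorithm, whose design is not described in this abstract, and it does not reflect CephFS internals. The server count, the helper names (`stable_hash`, `subtree_server`, `hash_server`), the example paths, and the choice to assign subtrees by top-level directory are all assumptions made for illustration.

```python
import hashlib
from collections import Counter

NUM_SERVERS = 4  # hypothetical number of metadata servers


def stable_hash(text: str) -> int:
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(text.encode()).digest()[:8], "big")


def subtree_server(path: str) -> int:
    """Simplified subtree partitioning: every path under the same
    top-level directory is served by the same metadata server."""
    top_level = path.strip("/").split("/")[0]
    return stable_hash(top_level) % NUM_SERVERS


def hash_server(path: str) -> int:
    """Hash-based mapping: the full path name picks the server, so
    files in the same directory are scattered across servers."""
    return stable_hash(path) % NUM_SERVERS


# A hypothetical hot directory holding 1000 files.
paths = [f"/home/alice/project/file{i}.txt" for i in range(1000)]

subtree_load = Counter(subtree_server(p) for p in paths)
hash_load = Counter(hash_server(p) for p in paths)

# Subtree partitioning keeps the directory on one server (locality),
# but that server takes the entire load.
print("subtree:", dict(subtree_load))
# Hash-based mapping spreads the load, but a directory scan now has
# to contact several servers.
print("hash:   ", dict(hash_load))
```

Under these assumptions, the subtree variant places all 1000 entries on a single server, while the hash variant distributes them across servers. This is the gap between namespace locality and load balance that a hybrid scheme aims to close.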
