BuffetFS: Serve Yourself Permission Checks without Remote Procedure Calls

Remote procedure call (RPC) latency is becoming increasingly significant in distributed file systems. We propose BuffetFS, a user-level file system that improves I/O performance by eliminating the RPCs caused by the open() operation. By moving open() from file servers to clients, BuffetFS performs permission checks locally and hence avoids RPCs during the initial stage of accessing a file. BuffetFS can further reduce response time when users access a large number of small files. We implement a BuffetFS prototype and integrate it into a storage cluster. Our preliminary evaluation shows that BuffetFS offers up to a 70% performance gain over the Lustre file system.
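To make the idea of a local permission check concrete, the following is a minimal sketch of an owner/group/other check evaluated entirely on the client. It is an illustration under stated assumptions, not BuffetFS's actual implementation: the abstract does not say how clients obtain file attributes, so the `CachedAttr` record and the `may_open` function are hypothetical names, and the sketch ignores ACLs and directory-traversal permissions.

```python
import stat
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedAttr:
    """Hypothetical client-side copy of a file's ownership and mode bits."""
    uid: int
    gid: int
    mode: int


def may_open(attr, uid, gids, want_read, want_write):
    """Owner/group/other check done locally, without contacting a server."""
    if uid == 0:
        # Superuser bypasses read/write permission bits.
        return True
    # Pick exactly one permission class, as POSIX-style checks do.
    if uid == attr.uid:
        r_bit, w_bit = stat.S_IRUSR, stat.S_IWUSR
    elif attr.gid in gids:
        r_bit, w_bit = stat.S_IRGRP, stat.S_IWGRP
    else:
        r_bit, w_bit = stat.S_IROTH, stat.S_IWOTH
    if want_read and not attr.mode & r_bit:
        return False
    if want_write and not attr.mode & w_bit:
        return False
    return True


# Example: a file owned by uid 1000, group 100, with mode rw-r-----.
attr = CachedAttr(uid=1000, gid=100, mode=0o640)
owner_rw = may_open(attr, 1000, {100}, want_read=True, want_write=True)
group_w = may_open(attr, 2000, {100}, want_read=False, want_write=True)
```

In this sketch the owner's read-write request is granted and the group member's write request is denied, both without a round trip. A real system would also need a way to keep the cached attributes consistent with the server's copy, which this example does not address.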
