The Composite-file File System: Decoupling the One-to-One Mapping of Files and Metadata for Better Performance

Traditional file system optimizations typically assume a one-to-one mapping of logical files to their physical metadata representations. This coupling rules out a class of optimizations that become possible once it is removed. We have designed, implemented, and evaluated a composite-file file system, which allows many-to-one mappings of files to metadata, and we have explored the design space of different mapping strategies. Under web server and software development workloads, our empirical evaluation shows up to a 27% performance improvement. This result demonstrates the promise of composite files.
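To make the many-to-one idea concrete, the following is a minimal in-memory sketch, not the system described above. It assumes a hypothetical layout in which several logical files share one metadata record that stores a per-file offset and length into a common data region. The names `CompositeMetadata`, `CompositeStore`, `add_group`, and `read` are illustrative assumptions and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class CompositeMetadata:
    """One shared metadata record (hypothetical layout) for several logical files."""
    owner: str
    mode: int
    # logical path -> (offset, length) inside the shared data region
    extents: dict = field(default_factory=dict)
    data: bytearray = field(default_factory=bytearray)


class CompositeStore:
    """Toy namespace in which many paths may resolve to one metadata record."""

    def __init__(self) -> None:
        # logical path -> shared metadata record (the many-to-one mapping)
        self.index = {}

    def add_group(self, owner: str, mode: int, files: dict) -> CompositeMetadata:
        """Pack a group of files behind a single metadata record."""
        meta = CompositeMetadata(owner, mode)
        for path, content in files.items():
            meta.extents[path] = (len(meta.data), len(content))
            meta.data += content
            self.index[path] = meta
        return meta

    def read(self, path: str) -> bytes:
        """Resolve a path to its shared record, then slice out its bytes."""
        meta = self.index[path]
        offset, length = meta.extents[path]
        return bytes(meta.data[offset:offset + length])


store = CompositeStore()
store.add_group("www", 0o644, {"/index.html": b"<html>", "/style.css": b"body{}"})
# Both paths resolve to the same metadata object, so one lookup serves the group.
same_record = store.index["/index.html"] is store.index["/style.css"]
```

In this sketch, files that tend to be accessed together (for example, a web page and its stylesheet) are grouped so that a single metadata record covers all of them. How files are chosen for grouping is exactly the mapping-strategy design space the abstract mentions, and is left out here.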
