The Design and Implementation of the Wave Transactional Filesystem

This paper introduces the Wave Transactional Filesystem (WTF), a novel, transactional, POSIX-compatible filesystem based on a new file slicing API that enables efficient file transformations. WTF provides transactional access to a distributed filesystem, eliminating the possibility of inconsistencies across multiple files. Further, the file slicing API enables applications to construct files from the contents of other files without having to rewrite or relocate data. Combined, these features enable a new class of high-performance applications. Experiments show that, by reducing I/O costs, WTF can qualitatively outperform the industry-standard HDFS distributed filesystem, by up to a factor of four in a sorting benchmark. Microbenchmarks indicate that the new features of WTF impose only a modest overhead on top of the POSIX-compatible API.
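
The idea of building files from the contents of other files without rewriting data can be illustrated with a small in-memory model. The sketch below is not WTF's API: the names `Slice`, `SliceStore`, `write_file`, `slice_range`, and `compose` are hypothetical, and transactions and distribution are not modeled at all. It only shows, under those assumptions, how a file represented as a list of pointers into immutable byte ranges can be recombined by editing metadata alone.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Slice:
    """Hypothetical pointer to a byte range inside an immutable stored blob."""
    blob_id: str
    offset: int
    length: int


@dataclass
class SliceStore:
    """Toy single-process model; not the WTF implementation."""
    blobs: Dict[str, bytes] = field(default_factory=dict)
    files: Dict[str, List[Slice]] = field(default_factory=dict)
    bytes_written: int = 0  # counts data bytes actually stored

    def write_file(self, name: str, data: bytes) -> None:
        """Store new data once and point the file at it."""
        blob_id = f"blob-{len(self.blobs)}"
        self.blobs[blob_id] = data
        self.bytes_written += len(data)
        self.files[name] = [Slice(blob_id, 0, len(data))]

    def slice_range(self, name: str, start: int, length: int) -> List[Slice]:
        """Return pointers covering bytes [start, start + length) of a file."""
        out: List[Slice] = []
        pos = 0  # logical position of the current slice within the file
        end = start + length
        for s in self.files[name]:
            lo = max(start, pos)
            hi = min(end, pos + s.length)
            if lo < hi:
                out.append(Slice(s.blob_id, s.offset + (lo - pos), hi - lo))
            pos += s.length
        return out

    def compose(self, name: str, slices: List[Slice]) -> None:
        """Create a file from existing slices; no data bytes are copied."""
        self.files[name] = list(slices)

    def read_file(self, name: str) -> bytes:
        """Materialize a file by following its slice pointers."""
        return b"".join(
            self.blobs[s.blob_id][s.offset:s.offset + s.length]
            for s in self.files[name]
        )


store = SliceStore()
store.write_file("a", b"hello world")
store.write_file("b", b"wave filesystem")
written_before = store.bytes_written

# Build "c" from the first 5 bytes of "b" and the last 5 bytes of "a".
store.compose("c", store.slice_range("b", 0, 5) + store.slice_range("a", 6, 5))
# Slices of an already-composed file resolve back to the original blobs.
store.compose("d", store.slice_range("c", 2, 6))
```

In this model, `compose` touches only the per-file pointer list, so `bytes_written` is the same before and after "c" and "d" are created. A real system would additionally need to make such metadata updates atomic across files, which this sketch does not attempt.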
