Flexible, Wide-Area Storage for Distributed Systems with WheelFS

WheelFS is a wide-area distributed storage system intended to help multi-site applications share data and gain fault tolerance. WheelFS takes the form of a distributed file system with a familiar POSIX interface. Its design allows applications to adjust the tradeoff between prompt visibility of updates from other sites and the ability of sites to operate independently despite failures and long delays. WheelFS allows these adjustments via semantic cues, which give applications control over consistency, failure handling, and file and replica placement. WheelFS is implemented as a user-level file system and is deployed on PlanetLab and Emulab. Three applications (a distributed Web cache, an email service, and large-file distribution) demonstrate that WheelFS's file system interface simplifies the construction of distributed applications by allowing reuse of existing software. These applications would perform poorly with the strict semantics implied by a traditional file system interface, but by providing cues to WheelFS they are able to achieve good performance. Measurements show that applications built on WheelFS deliver performance comparable to that of services such as CoralCDN and BitTorrent, which use specialized wide-area storage systems.
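As a rough illustration of how an application might attach semantic cues while still using an ordinary POSIX path, the sketch below assumes that cues are written as special dot-prefixed path components. The abstract does not specify the syntax, so the mount point `/wfs`, the cue names `EventualConsistency` and `MaxTime`, and the helper `with_cues` are all hypothetical and serve only to show the idea.

```python
import posixpath


def with_cues(mount, cues, relative_path):
    """Build a path whose leading components carry cues.

    Hypothetical layout: each cue becomes a component such as
    ".Name" or ".Name=value" placed directly under the mount point.
    """
    cue_parts = []
    for name, value in cues:
        if value is None:
            cue_parts.append("." + name)
        else:
            cue_parts.append(".{}={}".format(name, value))
    # Strip leading slashes so posixpath.join does not discard the prefix.
    return posixpath.join(mount, *cue_parts, relative_path.lstrip("/"))


# Assumed example: a Web cache that tolerates stale reads and bounds
# how long an operation may block (the units are an assumption here).
path = with_cues(
    "/wfs",
    [("EventualConsistency", None), ("MaxTime", 200)],
    "cache/index.html",
)
print(path)
```

Because the cues travel inside the path string under this assumption, existing software that only calls `open`, `read`, and `write` could pick them up without code changes, which is the kind of reuse the abstract describes.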
