Evaluating the Price of Consistency in Distributed File Storage Services

Distributed file storage services (DFSS), such as Dropbox, iCloud, SkyDrive, and Google Drive, offer a filesystem interface to a distributed data store. DFSS usually differ in the consistency level they provide for concurrent accesses: a client might access a cached version of a file, see the immediate results of all prior operations, or temporarily observe an inconsistent state. The choice of consistency level has a strong impact on performance, and it results from an inherent tradeoff among three properties: consistency, availability, and partition tolerance. Isolating the exact impact on performance is difficult, because DFSS are complex designs with multiple components and dependencies. Furthermore, each system has a different range of features, its own design and implementation, and its own optimizations, which prevents a fair comparison. In this paper, we take a step toward a principled comparison of DFSS components, focusing on the evaluation of consistency mechanisms. We propose a novel modular DFSS testbed named FlexiFS, which implements a range of state-of-the-art techniques for the distribution, replication, routing, and indexing of data. Using FlexiFS, we survey six consistency levels: linearizability, sequential consistency, and eventual consistency, each operating with and without close-to-open semantics. Our evaluation shows that: (i) as expected, POSIX semantics (i.e., linearizability without close-to-open semantics) harm performance; and (ii) when close-to-open semantics is used, linearizability delivers performance similar to that of sequential or eventual consistency.
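
The difference between propagating every operation and close-to-open semantics can be illustrated with a toy cost model. The sketch below is not FlexiFS code: the classes RemoteStore, PerOperationFile, and CloseToOpenFile and the function run are hypothetical names introduced here, and the only cost counted is the number of round-trips to a single in-memory stand-in for the distributed store. It ignores concurrency, replication, and the actual consistency protocols, so it only suggests why buffering writes between open and close may amortize coordination cost.

```python
class RemoteStore:
    """Hypothetical stand-in for the distributed store; counts round-trips."""

    def __init__(self):
        self.files = {}
        self.round_trips = 0

    def fetch(self, path):
        self.round_trips += 1
        return self.files.get(path, b"")

    def push(self, path, data):
        self.round_trips += 1
        self.files[path] = data


class PerOperationFile:
    """Assumed model without close-to-open: each write goes to the store."""

    def __init__(self, store, path):
        self.store = store
        self.path = path

    def write(self, data):
        # One fetch and one push per write, so the store is always current.
        current = self.store.fetch(self.path)
        self.store.push(self.path, current + data)

    def close(self):
        pass  # Nothing left to propagate.


class CloseToOpenFile:
    """Assumed model with close-to-open: fetch on open, push on close."""

    def __init__(self, store, path):
        self.store = store
        self.path = path
        self.buffer = store.fetch(path)  # Single fetch when the file is opened.

    def write(self, data):
        self.buffer += data  # Local only; other clients do not see this yet.

    def close(self):
        self.store.push(self.path, self.buffer)  # Single push on close.


def run(file_cls, n_writes):
    """Append n_writes bytes through file_cls; return (round_trips, content)."""
    store = RemoteStore()
    f = file_cls(store, "/a.txt")
    for _ in range(n_writes):
        f.write(b"x")
    f.close()
    return store.round_trips, store.files["/a.txt"]


if __name__ == "__main__":
    print(run(PerOperationFile, 100)[0])
    print(run(CloseToOpenFile, 100)[0])
```

In this toy model both variants leave the same final file content, but the per-operation variant contacts the store on every write while the close-to-open variant contacts it only at open and close. Under that assumption, the cost of a stronger consistency protocol would be paid far less often, which is one plausible reading of result (ii).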
