Depot: Cloud storage with minimal trust (extended version)

The paper describes the design, implementation, and evaluation of Depot, a cloud storage system that minimizes trust assumptions. Depot tolerates buggy or malicious behavior by any number of clients or servers, yet it provides safety and liveness guarantees to correct clients. Depot provides these guarantees using a two-layer architecture. First, Depot ensures that the updates observed by correct nodes are consistently ordered under Fork-Join-Causal consistency (FJC). FJC is a slight weakening of causal consistency that can be both safe and live despite faulty nodes. Second, Depot implements protocols that use this consistent ordering of updates to provide other desirable consistency, staleness, durability, and recovery properties. Our evaluation suggests that the costs of these guarantees are modest and that Depot can tolerate faults and maintain good availability, latency, overhead, and staleness even when significant faults occur.
