Pond: The OceanStore Prototype

OceanStore is an Internet-scale, persistent data store designed for incremental scalability, secure sharing, and long-term durability. Pond is the OceanStore prototype; it contains many of the features of a complete system including location-independent routing, Byzantine update commitment, push-based update of cached copies through an overlay multicast network, and continuous archiving to erasure-coded form. In the wide area, Pond outperforms NFS by up to a factor of 4.6 on readintensive phases of the Andrew benchmark, but underperforms NFS by as much as a factor of 7.3 on writeintensive phases. Microbenchmarks show that write performance is limited by the speed of erasure coding and threshold signature generation, two important areas of future research. Further microbenchmarks show that Pond manages replica consistency in a bandwidth-efficient manner and quantify the latency cost imposed by this bandwidth savings.

[1]  Ralph C. Merkle,et al.  A Digital Signature Based on a Conventional Encryption Function , 1987, CRYPTO.

[2]  Mahadev Satyanarayanan,et al.  Scale and performance in a distributed file system , 1987, SOSP '87.

[3]  Michael Stonebraker,et al.  The Design of the POSTGRES Storage System , 1988, VLDB.

[4]  Scale and performance in a distributed file system , 1988, TOCS.

[5]  Michael N. Nelson,et al.  Caching in the Sprite network file system , 1988, TOCS.

[6]  Mahadev Satyanarayanan,et al.  Scalable, secure, and highly available distributed file access , 1990, Computer.

[7]  Marvin Theimer,et al.  The Bayou Architecture: Support for Data Sharing Among Mobile Users , 1994, 1994 First Workshop on Mobile Computing Systems and Applications.

[8]  Mahadev Satyanarayanan,et al.  Disconnected Operation in the Coda File System , 1999, Mobidata.

[9]  Marek Karpinski,et al.  An XOR-based erasure-resilient coding scheme , 1995 .

[10]  Dennis Shasha,et al.  The dangers of replication and a solution , 1996, SIGMOD '96.

[11]  Ellen W. Zegura,et al.  How to model an internetwork , 1996, Proceedings of IEEE INFOCOM '96. Conference on Computer Communications.

[12]  R. Anderson The Eternity Service , 1996 .

[13]  Martín Abadi,et al.  On SDSI's linked local name spaces , 1997, Proceedings 10th Computer Security Foundations Workshop.

[14]  Miron Livny,et al.  Transactional client-server cache consistency: alternatives and performance , 1997, TODS.

[15]  Andrew V. Goldberg,et al.  Towards an archival Intermemory , 1998, Proceedings IEEE International Forum on Research and Technology Advances in Digital Libraries -ADL'98-.

[16]  Tal Rabin,et al.  A Simplified Approach to Threshold and Proactive RSA , 1998, CRYPTO.

[17]  Miguel Oom Temudo de Castro,et al.  Practical Byzantine fault tolerance , 1999, OSDI '99.

[18]  Norman C. Hutchinson,et al.  Deciding when to forget in the Elephant file system , 1999, SOSP.

[19]  Pradeep K. Khosla,et al.  Survivable Information Storage Systems , 2000, Computer.

[20]  Aviel D. Rubin,et al.  Publius: a robust, tamper-evident, censorship-resistant web publishing system , 2000 .

[21]  Miguel Castro,et al.  Proactive recovery in a Byzantine-fault-tolerant system , 2000, OSDI.

[22]  Ben Y. Zhao,et al.  OceanStore: an architecture for global-scale persistent storage , 2000, SIGP.

[23]  Ian Clarke,et al.  Freenet: A Distributed Anonymous Information Storage and Retrieval System , 2000, Workshop on Design Issues in Anonymity and Unobservability.

[24]  Marvin Theimer,et al.  Feasibility of a serverless distributed file system deployed on an existing set of desktop PCs , 2000, SIGMETRICS '00.

[25]  Victor Shoup,et al.  Practical Threshold Signatures , 2000, EUROCRYPT.

[26]  Dan Boneh,et al.  Building intrusion tolerant applications , 1999, Proceedings DARPA Information Survivability Conference and Exposition. DISCEX'00.

[27]  Peter Druschel,et al.  Pastry: Scalable, distributed object location and routing for large-scale peer-to- , 2001 .

[28]  MaziéresDavid,et al.  A low-bandwidth network file system , 2001 .

[29]  Ben Y. Zhao,et al.  An Infrastructure for Fault-tolerant Wide-area Location and Routing , 2001 .

[30]  David Mazières,et al.  A Toolkit for User-Level File Systems , 2001, USENIX Annual Technical Conference, General Track.

[31]  Mark Handley,et al.  A scalable content-addressable network , 2001, SIGCOMM '01.

[32]  David R. Karger,et al.  Wide-area cooperative storage with CFS , 2001, SOSP.

[33]  Michael K. Reiter,et al.  Persistent objects in the Fleet system , 2001, Proceedings DARPA Information Survivability Conference and Exposition II. DISCEX'01.

[34]  Antony I. T. Rowstron,et al.  Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility , 2001, SOSP.

[35]  V. T. Rajan,et al.  Java without the coffee breaks: a nonintrusive multiprocessor garbage collector , 2001, PLDI '01.

[36]  Antony I. T. Rowstron,et al.  Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems , 2001, Middleware.

[37]  Robert Morris,et al.  Chord: A scalable peer-to-peer lookup service for internet applications , 2001, SIGCOMM 2001.

[38]  David R. Karger,et al.  Chord: A scalable peer-to-peer lookup service for internet applications , 2001, SIGCOMM '01.

[39]  David E. Culler,et al.  SEDA: an architecture for well-conditioned, scalable internet services , 2001, SOSP.

[40]  David E. Culler,et al.  SEDA: An Architecture for Scalable, Well-Conditioned Internet Services , 2001 .

[41]  Moni Naor,et al.  Viceroy: a scalable and dynamic emulation of the butterfly , 2002, PODC '02.

[42]  John Kubiatowicz,et al.  Erasure Coding Vs. Replication: A Quantitative Comparison , 2002, IPTPS.

[43]  John Kubiatowicz,et al.  Efficient heartbeats and repair of softstate in decentralized object location and routing systems , 2002, EW 10.

[44]  Magnus Karlsson,et al.  Taming aggressive replication in the Pangaea wide-area file system , 2002, OPSR.

[45]  John Kubiatowicz,et al.  Probabilistic location and routing , 2002, Proceedings.Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies.

[46]  Ben Y. Zhao,et al.  Distributed Object Location in a Dynamic Network , 2002, SPAA '02.

[47]  Randy H. Katz,et al.  SCAN: A Dynamic, Scalable, and Efficient Content Distribution Network , 2002, Pervasive.

[48]  Ben Y. Zhao,et al.  Distributed Data Location in a Dynamic Network , 2002 .

[49]  Timothy Roscoe,et al.  Mnemosyne: Peer-to-Peer Steganographic Storage , 2002, IPTPS.

[50]  Randy H. Katz,et al.  Dynamic Replica Placement for Scalable Content Delivery , 2002, IPTPS.

[51]  David Mazières,et al.  Kademlia: A Peer-to-Peer Information System Based on the XOR Metric , 2002, IPTPS.

[52]  Robert Tappan Morris,et al.  Ivy: a read/write peer-to-peer file system , 2002, OSDI '02.

[53]  John Kubiatowicz,et al.  Introspective failure analysis: avoiding correlated failures in peer-to-peer systems , 2002, 21st IEEE Symposium on Reliable Distributed Systems, 2002. Proceedings..

[54]  Robbert van Renesse,et al.  COCA: a secure distributed online certification authority , 2002, Foundations of Intrusion Tolerant Systems, 2003 [Organically Assured and Survivable Information Systems].

[55]  J. Kubiatowicz,et al.  Naming and Integrity: Self-verifying Data in Peer-to-Peer Systems , 2003, Future Directions in Distributed Computing.