Pastiche: Making Backup Cheap and Easy

Proceedings of the 5th Symposium on Operating Systems Design and Implementation

Backup is cumbersome and expensive. Individual users almost never back up their data, and backup is a significant cost in large organizations. This paper presents Pastiche, a simple and inexpensive backup system. Pastiche exploits excess disk capacity to perform peer-to-peer backup with no administrative costs. Each node minimizes storage overhead by selecting peers that share a significant amount of data. It is easy for common installations to find suitable peers, and peers with high overlap can be identified with only hundreds of bytes. Pastiche provides mechanisms for confidentiality, integrity, and detection of failed or malicious peers. A Pastiche prototype suffers only 7.4% overhead for a modified Andrew Benchmark, and restore performance is comparable to cross-machine copy.
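The claim that high-overlap peers can be identified with only hundreds of bytes can be illustrated with a minimal sketch. It is not Pastiche's actual mechanism, which the abstract does not detail. It assumes fixed-size chunks, truncated SHA-256 digests, and a summary made of the k smallest digests. The names `chunk_digests`, `make_summary`, and `estimated_coverage`, and the constants `CHUNK_SIZE` and `k`, are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4096  # assumption: fixed-size chunks, chosen only for illustration


def chunk_digests(data: bytes, chunk_size: int = CHUNK_SIZE) -> set:
    """Return the set of 8-byte truncated SHA-256 digests of each chunk."""
    return {
        hashlib.sha256(data[i:i + chunk_size]).digest()[:8]
        for i in range(0, len(data), chunk_size)
    }


def make_summary(digests: set, k: int = 32) -> list:
    """Pick the k numerically smallest digests as a compact summary.

    With k = 32 and 8-byte digests the summary is at most 256 bytes.
    """
    return sorted(digests)[:k]


def estimated_coverage(summary: list, peer_digests: set) -> float:
    """Fraction of the summary's digests that the candidate peer already holds."""
    if not summary:
        return 0.0
    hits = sum(1 for d in summary if d in peer_digests)
    return hits / len(summary)


if __name__ == "__main__":
    # Two hypothetical nodes sharing half of their chunks.
    shared = b"".join(bytes([i]) * CHUNK_SIZE for i in range(32))
    node_a = shared + b"".join(bytes([i]) * CHUNK_SIZE for i in range(32, 64))
    node_b = shared + b"".join(bytes([i]) * CHUNK_SIZE for i in range(100, 132))

    summary_a = make_summary(chunk_digests(node_a))
    print(len(summary_a) * 8, "bytes sent")
    print("estimated coverage:", estimated_coverage(summary_a, chunk_digests(node_b)))
```

In this sketch a node would send only its summary to a candidate peer, and the candidate would report what fraction of it it already stores. A node would then prefer candidates reporting a high fraction, since those need to store less new data on its behalf.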
