On Scheduling and Redundancy for P2P Backup

An online backup system should be quick and reliable in both saving and restoring users' data. In a peer-to-peer implementation, achieving this requires choosing the data transfer schedule and the amount of redundancy wisely. We formalize the problem of exchanging multiple pieces of data with intermittently available peers, and we show that random scheduling completes transfers in nearly optimal time as long as the system is sufficiently large. Moreover, we propose an adaptive redundancy scheme that improves performance and decreases resource usage while keeping the risk of data loss low. Extensive simulations show that our techniques are effective in a realistic trace-driven scenario with heterogeneous bandwidth.
