Data transfer scheduling for P2P storage

In Peer-to-Peer storage and backup applications, large amounts of data have to be transferred between nodes. In general, recipients of data transfers are not chosen randomly from the whole set of nodes in the Peer-to-Peer network; they are chosen according to peer selection rules that impose criteria such as resource contributions, position in DHTs, or trust between nodes. Imposing overly stringent restrictions on which nodes are eligible to receive data can increase the time needed to complete data transfers, and scheduling choices influence this time as well. We formalize the problem of data transfer scheduling and devise means for calculating optimal scheduling choices, given a posteriori knowledge of the availability patterns of nodes; we then propose realistic scheduling policies and evaluate their transfer-time overhead with respect to the optimum. We show that allowing even a small amount of flexibility in choosing nodes after the peer selection step yields large improvements in the time to complete transfers, and that even simple informed scheduling policies can significantly reduce transfer-time overhead.
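The following is a toy illustration of the scheduling problem, not the formulation used in the paper. It assumes a deliberately simplified, time-slotted model: the data owner uploads at most one unit per slot, each recipient stores at most one unit, and `k` units must be placed on `k` distinct peers. Under these assumptions, the optimal a posteriori schedule reduces to a bipartite matching between slots and peers. The function names and the availability traces are hypothetical.

```python
def optimal_completion_slot(owner_online, peers_online, k):
    """Earliest number of slots after which k distinct peers can hold one unit each,
    computed with full (a posteriori) knowledge of availability.

    Assumed toy model: one unit per slot, one unit per peer.
    Uses incremental augmenting-path bipartite matching (slots vs. peers).
    """
    peer_to_slot = {}  # peer index -> slot in which that peer is served

    def try_assign(slot, visited):
        # Look for a peer online in `slot`, re-routing earlier slots if needed.
        for p, online in enumerate(peers_online):
            if online[slot] and p not in visited:
                visited.add(p)
                if p not in peer_to_slot or try_assign(peer_to_slot[p], visited):
                    peer_to_slot[p] = slot
                    return True
        return False

    for t in range(len(owner_online)):
        if owner_online[t] and try_assign(t, set()):
            if len(peer_to_slot) == k:
                return t + 1
    return None  # not enough transfer opportunities in the trace


def greedy_completion_slot(owner_online, peers_online, k):
    """Uninformed policy: in each slot, serve the first online peer not yet served."""
    served = set()
    for t, up in enumerate(owner_online):
        if not up:
            continue
        for p, online in enumerate(peers_online):
            if online[t] and p not in served:
                served.add(p)
                break
        if len(served) == k:
            return t + 1
    return None


# Hypothetical availability traces over four slots (1 = online).
owner = [1, 1, 1, 1]
peer_a = [1, 1, 0, 0]
peer_b = [1, 0, 0, 1]
peer_c = [0, 1, 0, 0]  # extra candidate allowed by a looser selection rule

strict_optimal = optimal_completion_slot(owner, [peer_a, peer_b], 2)
strict_greedy = greedy_completion_slot(owner, [peer_a, peer_b], 2)
flexible_greedy = greedy_completion_slot(owner, [peer_a, peer_b, peer_c], 2)
```

In this hand-built trace, the greedy policy with exactly two eligible peers serves `peer_a` first and then has to wait for `peer_b` to return, while the a posteriori optimum serves `peer_b` first. Adding a single extra candidate (`peer_c`) lets even the greedy policy finish as early as the optimum, which mirrors the abstract's point about flexibility in this simplified setting only.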
