Self-organized Fault-tolerant Routing in Peer-to-Peer Overlays

In sufficiently large heterogeneous overlays, message loss and delays are likely to occur. This has a significant impact on overlay routing, especially on longer paths. Existing solutions to this problem rely on message redundancy to mask loss and delays, which incurs a significant bandwidth cost. We propose the Forward Feedback Protocol (FFP), which routes only a single copy of each message and detects message loss and excessive delays while routing. Failures are signaled along the routing paths. Based only on these simple binary signals, each overlay node learns, locally and independently, to route around failures. These local node interactions lead to the emergence of fast, reliable overlay routes. The process is continuous: the system constantly self-organizes in response to changing delay and loss conditions. We evaluate the protocol in an Internet deployment and in simulation. Our system uses 2-5 times less bandwidth than existing overlay routing approaches that rely on high message redundancy for fault tolerance. Despite its marginal bandwidth investment in reliability, FFP achieves up to a $30\%$ higher delivery success rate than existing solutions. The protocol is scalable, with a local state size of $O(\log^2 N)$ in the network size $N$, and is universally applicable to all recursively routing overlays.
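To make the idea of learning from binary failure signals concrete, the following is a minimal sketch and not the FFP algorithm itself. It assumes each node keeps one reliability score per next hop and updates it with an exponential moving average; the class name `FeedbackRouter`, the learning rate `alpha`, and the optimistic initial score are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackRouter:
    """Hypothetical per-node state: one reliability score per next hop."""
    alpha: float = 0.3  # assumed learning rate, not taken from the paper
    scores: dict = field(default_factory=dict)

    def choose(self, candidates):
        # Unseen hops start at 1.0 (optimistic) so each is tried at least once.
        return max(candidates, key=lambda hop: self.scores.get(hop, 1.0))

    def feedback(self, hop, ok):
        # Binary signal: ok=True for timely delivery, False for loss or delay.
        old = self.scores.get(hop, 1.0)
        signal = 1.0 if ok else 0.0
        self.scores[hop] = (1 - self.alpha) * old + self.alpha * signal


router = FeedbackRouter()
candidates = ["A", "B"]
first = router.choose(candidates)   # both unseen, so the first candidate wins
router.feedback(first, ok=False)    # a failure signal lowers that hop's score
second = router.choose(candidates)  # the node now prefers the other hop
```

Only a single copy of the message is forwarded here; the node adapts its choice from the success or failure signal alone, which is the property the abstract emphasizes.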
