On the Correctness of Gossip-based Membership Protocols

The importance of scalability and fault tolerance in modern distributed systems has led to considerable research on multicast protocols using gossip. In a gossip protocol, each node forwards messages to a small set of "gossip partners" chosen at random from the entire group membership. By discarding the strong reliability guarantees of traditional protocols in favor of probabilistic guarantees, gossip protocols can deliver greater scalability and fault tolerance. In early gossip algorithms, partners were chosen uniformly at random from the entire membership, which limited scalability because of the resources required to store and maintain a complete membership view at each node. Later protocols avoided this issue by storing much smaller random subsets of the membership at each node and choosing gossip partners only from these local views. Such protocols are subtle: at least some local views must change in response to group membership changes in order to preserve connectivity and performance guarantees. While these protocols have been the subject of much simulation and analysis, formal proofs of key properties, in particular the probability of partitioning, have remained elusive. In this thesis we give a new scalable gossip-based algorithm for local view maintenance, together with a proof that the expected time until a network partition is at least exponential in the view size and the size of the departing set. We develop probabilistic bounds on the in-degree (hence the load) of individual nodes, and argue that protocols lacking our reinforcement component eventually converge to star-like networks, whose connectivity depends on a small set of overloaded nodes. We argue that the undirected connectivity graph is an expander, on which application-level gossip multicast protocols will converge rapidly. An analysis of the membership system under heavy churn yields a lower bound on the amount of communication required per round. Finally, we offer some arguments supporting the experimental observation that the elements of the local views, although not a uniformly random sample of the nodes in the system, have a high degree of randomness, and suggesting that the state of the system after O(ln n) iterations is independent of the initial state.
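
The local-view idea described above can be illustrated with a minimal sketch. It is not the algorithm of the thesis: it contains no view maintenance or reinforcement step, and the names `build_local_views`, `gossip_rounds`, `view_size`, and `fanout` are assumptions introduced here for illustration. Each node holds a small random subset of the membership and forwards a message only to partners drawn from that subset.

```python
import random

def build_local_views(n, view_size, rng):
    """Give each node a random subset of the other nodes as its local view.

    Hypothetical helper: static views only, no join/leave handling.
    """
    views = {}
    for node in range(n):
        others = [m for m in range(n) if m != node]
        views[node] = rng.sample(others, view_size)
    return views

def gossip_rounds(views, source, fanout, rng, max_rounds=100):
    """Spread one message from `source`, choosing partners only from local views.

    Returns (rounds_used, informed_nodes). Only newly informed nodes
    forward the message in the following round.
    """
    informed = {source}
    frontier = {source}
    rounds = 0
    while frontier and rounds < max_rounds:
        next_frontier = set()
        for node in frontier:
            view = views[node]
            # Pick up to `fanout` distinct partners from this node's local view.
            partners = rng.sample(view, min(fanout, len(view)))
            for partner in partners:
                if partner not in informed:
                    informed.add(partner)
                    next_frontier.add(partner)
        frontier = next_frontier
        rounds += 1
    return rounds, informed

if __name__ == "__main__":
    rng = random.Random(0)
    views = build_local_views(n=200, view_size=8, rng=rng)
    rounds, informed = gossip_rounds(views, source=0, fanout=3, rng=rng)
    print(f"informed {len(informed)} of 200 nodes in {rounds} rounds")
```

If the views happen to form disconnected clusters, for example `{0: [1], 1: [0], 2: [3], 3: [2]}`, a message started at node 0 never reaches nodes 2 and 3. This is the partitioning risk that view maintenance is meant to keep improbable.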
