Gossip-based aggregation in large dynamic networks

As computer networks increase in size, become more heterogeneous, and span greater geographic distances, applications must be designed to cope with very large scale, poor reliability, and often extreme dynamism in the underlying network. Aggregation is a key functional building block for such applications: it refers to a set of functions that give the components of a distributed system access to global information such as network size, average load, average uptime, and the location and description of hotspots. Local access to global information is often very useful, if not indispensable, for building applications that are robust and adaptive. For example, in an industrial control application, an aggregate value reaching a threshold may trigger the execution of certain actions; a distributed storage system will want to know the total available free space; and load-balancing protocols may benefit from knowing the target average load so as to minimize the load they transfer. We propose a gossip-based protocol for computing aggregate values over network components in a fully decentralized fashion. The class of aggregate functions we can compute is very broad and includes many useful special cases such as counting, averages, sums, products, and extremal values. The protocol is suitable for extremely large and highly dynamic systems because of its proactive structure: all nodes receive the aggregate value continuously and can therefore track any changes in the system. The protocol is also extremely lightweight, making it suitable for many distributed applications, including peer-to-peer and grid computing systems. We demonstrate the efficiency and robustness of our gossip-based protocol both theoretically and experimentally under a variety of scenarios, including node and communication failures.
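
To make the idea of decentralized aggregation concrete, the following is a minimal simulation sketch, not the protocol as specified in the paper. It assumes a pairwise averaging update in which two nodes exchange their current estimates and both adopt the mean, a uniformly random choice of peer, and a failure-free network. The names `gossip_average` and `estimate_size` are hypothetical and introduced here only for illustration. Counting is shown as a special case of averaging: if one node starts with 1 and all others with 0, the average is 1/N, so each node can estimate N locally.

```python
import random


def gossip_average(values, rounds=50, seed=0):
    """Simulate pairwise averaging gossip (an assumed update rule).

    In each round every node contacts one uniformly random other node,
    and both replace their estimates with the mean of the two. The sum
    of all estimates is preserved, so they drift toward the global mean.
    Requires at least two nodes.
    """
    rng = random.Random(seed)
    est = list(values)
    n = len(est)
    for _ in range(rounds):
        for i in range(n):
            # Pick a peer j != i uniformly at random.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            mean = (est[i] + est[j]) / 2.0
            est[i] = est[j] = mean
    return est


def estimate_size(n, rounds=50, seed=0):
    """Counting via averaging: one node holds 1, the rest hold 0."""
    init = [1.0] + [0.0] * (n - 1)
    est = gossip_average(init, rounds, seed)
    # Each node inverts its local estimate of 1/N.
    return [1.0 / e if e > 0 else float("inf") for e in est]


if __name__ == "__main__":
    loads = [float(x) for x in range(100)]
    print(gossip_average(loads)[:3])   # local estimates of the average load
    print(estimate_size(50)[:3])       # local estimates of the network size
```

In this sketch every node ends up holding its own copy of the estimate, which mirrors the proactive structure described above; node and communication failures are deliberately left out.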
