The Complexity of Data Aggregation in Directed Networks

We study problems of data aggregation, such as approximate counting and computing the minimum input value, in synchronous directed networks with bounded message bandwidth B = Ω(log n). In undirected networks of diameter D, many such problems can easily be solved in O(D) rounds, using O(log n)-size messages. We show that for directed networks this is not the case: when the bandwidth B is small, several classical data aggregation problems have a time complexity that depends polynomially on the size of the network, even when the diameter of the network is constant. We show that computing an ε-approximation to the size n of the network requires Ω(min{n, 1/ε²}/B) rounds, even in networks of diameter 2. We also show that computing a sensitive function (e.g., minimum and maximum) requires Ω(√n/B) rounds in networks of diameter 2, provided that the diameter is not known in advance to be o(√n/B). Our lower bounds are established by reduction from several well-known problems in communication complexity. On the positive side, we give a nearly optimal O(D + √n/B)-round algorithm for computing simple sensitive functions using messages of size B = Ω(log N), where N is a loose upper bound on the size of the network and D is the diameter.
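
The undirected O(D)-round baseline mentioned above can be illustrated with a small simulation. This is only a sketch of plain minimum-flooding, not the algorithm of the paper: it assumes a connected undirected graph, assumes every node already knows the diameter D (computed here centrally by BFS purely for the simulation), and assumes each input fits in one message. The names `diameter`, `flood_min`, and the example graph are hypothetical.

```python
from collections import deque


def diameter(adj):
    """Largest BFS distance between any two nodes of a connected undirected graph."""
    best = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


def flood_min(adj, inputs, rounds):
    """Simulate synchronous rounds in which every node sends its current
    minimum to all neighbors and keeps the smallest value it has seen."""
    state = dict(inputs)
    for _ in range(rounds):
        # Every update in a round reads only the previous round's state.
        state = {u: min([state[u]] + [state[v] for v in adj[u]]) for u in adj}
    return state


# Hypothetical example: the path 0 - 1 - 2 - 3 - 4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
inputs = {0: 7, 1: 9, 2: 8, 3: 6, 4: 3}

D = diameter(adj)
result = flood_min(adj, inputs, D)
```

On this path the minimum sits at one end, so the node at the other end learns it only in round D; stopping one round earlier leaves that node with a wrong answer. The sketch relies on each node knowing when D rounds have elapsed, which is the kind of advance knowledge the lower bound for sensitive functions is conditioned on.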
