Functional Monitoring without Monotonicity

The notion of distributed functional monitoring was recently introduced by Cormode, Muthukrishnan and Yi to initiate a formal study of the communication cost of certain fundamental problems arising in distributed systems, especially sensor networks. In this model, each of k sites reads a stream of tokens and is in communication with a central coordinator, who wishes to continuously monitor some function f of σ, the union of the k streams. The goal is to minimize the number of bits communicated by a protocol that correctly monitors f(σ), to within some small error. As in previous work, we focus on a threshold version of the problem, where the coordinator's task is simply to maintain a single output bit, which is 0 whenever f(σ) ≤ τ(1 − ε) and 1 whenever f(σ) ≥ τ. Following Cormode et al., we term this the (k, f, τ, ε) functional monitoring problem. In previous work, some upper and lower bounds were obtained for this problem, with f being a frequency moment function, e.g., F_0, F_1, F_2. Importantly, these functions are monotone. Here, we further advance the study of such problems, proving three new classes of results. First, we provide nontrivial monitoring protocols when f is either H, the empirical Shannon entropy of a stream, or any of a related class of entropy functions (Tsallis entropies). These are the first nontrivial algorithms for distributed monitoring of non-monotone functions. Second, we study the effect of non-monotonicity of f on our ability to give nontrivial monitoring protocols, by considering f = F_p with deletions allowed, as well as f = H. Third, we prove new lower bounds on this problem when f = F_p, for several values of p.
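To make the threshold condition and the role of non-monotonicity concrete, the following is a minimal sketch, not a monitoring protocol from the paper. It computes the empirical Shannon entropy of the union of the site streams centrally (no communication saving is attempted) and reports which output bit the threshold problem requires. The function names, the choice of base-2 logarithms, the toy streams, and the values of τ and ε are all assumptions made for illustration.

```python
import math
from collections import Counter


def empirical_entropy(stream):
    """Empirical Shannon entropy, in bits, of a token sequence.

    Uses H = sum_i (m_i / m) * log2(m / m_i), where m_i is the count of
    token i and m is the stream length. Returns 0.0 for an empty stream.
    """
    m = len(stream)
    if m == 0:
        return 0.0
    counts = Counter(stream)
    return sum((c / m) * math.log2(m / c) for c in counts.values())


def required_output(value, tau, eps):
    """Output bit demanded by the threshold problem for a given f-value.

    Returns 0 if value <= tau * (1 - eps), 1 if value >= tau, and None
    in the gap between the two, where either bit is acceptable.
    """
    if value <= tau * (1 - eps):
        return 0
    if value >= tau:
        return 1
    return None


# Hypothetical input: k = 2 sites, each having read two tokens.
site_streams = [["a", "b"], ["a", "b"]]
union = [token for stream in site_streams for token in stream]
h_before = empirical_entropy(union)

# Four more "a" tokens arrive; only insertions, yet the entropy drops.
union_after = union + ["a"] * 4
h_after = empirical_entropy(union_after)

tau, eps = 0.9, 0.05  # illustrative threshold and error parameters
bit_before = required_output(h_before, tau, eps)
bit_after = required_output(h_after, tau, eps)
```

In this toy run the required bit moves from 1 back to 0 although tokens were only inserted. For a monotone function such as F_1 on an insert-only stream, the value never decreases, so the required bit can never return from 1 to 0.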

[1]  Krzysztof Onak,et al.  Sketching and Streaming Entropy via Approximation Theory , 2008, 2008 49th Annual IEEE Symposium on Foundations of Computer Science.

[2]  Noga Alon,et al.  The space complexity of approximating the frequency moments , 1996, STOC '96.

[3]  David P. Woodruff Efficient and private distance approximation in the communication and streaming models , 2007 .

[4]  S. Muthukrishnan Some Algorithmic Problems and Results in Compressed Sensing , 2006 .

[6]  Christopher Olston,et al.  Distributed top-k monitoring , 2003, SIGMOD '03.

[7]  Assaf Schuster,et al.  A Geometric Approach to Monitoring Threshold Functions over Distributed Data Streams , 2010, Ubiquitous Knowledge Discovery.

[8]  Jack K. Wolf,et al.  Noiseless coding of correlated information sources , 1973, IEEE Trans. Inf. Theory.

[9]  Sumit Ganguly,et al.  Estimating Entropy over Data Streams , 2006, ESA.

[10]  Satish Kumar,et al.  Next century challenges: scalable coordination in sensor networks , 1999, MobiCom.

[11]  C. Tsallis Possible generalization of Boltzmann-Gibbs statistics , 1988 .

[12]  Graham Cormode,et al.  What’s Different: Distributed, Continuous Monitoring of Duplicate-Resilient Aggregates on Data Streams , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[13]  Abhinandan Das,et al.  Distributed Set Expression Cardinality Estimation , 2004, VLDB.

[14]  Ilan Newman,et al.  Private vs. Common Random Bits in Communication Complexity , 1991, Inf. Process. Lett.

[15]  Qin Zhang,et al.  Multi-dimensional online tracking , 2009, SODA.

[16]  S. Muthukrishnan,et al.  Data streams: algorithms and applications , 2005, SODA '03.

[17]  Graham Cormode,et al.  Algorithms for distributed functional monitoring , 2008, SODA '08.