Tracking distributed aggregates over time-based sliding windows

Distributed monitoring requires tracking the value of a function of distributed data as new observations are made. An important case is when attention is restricted to a recent time period, such as the last hour of readings; this is the sliding window case. In this announcement, we outline a novel paradigm for handling such monitoring problems, which we dub the "forward/backward" approach. It provides clean solutions for several fundamental problems, such as counting, tracking frequent items, and maintaining order statistics. We obtain efficient protocols for these problems that improve on previous work and are easy to implement. Specifically, we obtain optimal O((k/ε) log(εn/k)) communication per window of n updates for tracking counts and heavy hitters with accuracy ε across k sites, and near-optimal communication of O((k/ε) log²(1/ε) log(n/k)) for quantiles.
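To give a feel for the scale of the stated bounds, the following minimal sketch evaluates the two expressions, (k/ε) log(εn/k) and (k/ε) log²(1/ε) log(n/k), for one set of parameters. It is purely illustrative: the function names, the choice of base-2 logarithms, and the sample values of k, ε, and n are assumptions, and the constants hidden by the O-notation are dropped, so the results indicate growth rates rather than actual message counts.

```python
import math


def count_bound(k: int, eps: float, n: int) -> float:
    """(k/eps) * log(eps*n/k): counts and heavy hitters, constants dropped."""
    return (k / eps) * math.log2(eps * n / k)


def quantile_bound(k: int, eps: float, n: int) -> float:
    """(k/eps) * log^2(1/eps) * log(n/k): quantiles, constants dropped."""
    return (k / eps) * math.log2(1 / eps) ** 2 * math.log2(n / k)


if __name__ == "__main__":
    # Hypothetical parameters: 16 sites, accuracy 1/16, 2^20 updates per window.
    k, eps, n = 16, 0.0625, 2**20
    print("counts / heavy hitters:", count_bound(k, eps, n))
    print("quantiles:", quantile_bound(k, eps, n))
    # For comparison only: forwarding every update costs n messages per window.
    print("forward every update:", n)
```

With these sample values both expressions are far below the n messages that forwarding every update to a coordinator would cost, which is the regime in which such protocols are of interest.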
