Tracking the $\ell_2$ Norm with Constant Update Time

The $\ell_2$ tracking problem is the task of obtaining a streaming algorithm that, given access to a stream of items $a_1, a_2, a_3, \ldots$ from a universe $[n]$, outputs at each time $t$ an estimate of the $\ell_2$ norm of the frequency vector $f^{(t)} \in \mathbb{R}^n$ (where $f^{(t)}_i$ is the number of occurrences of item $i$ in the stream up to time $t$). The previous work [Braverman-Chestnut-Ivkin-Nelson-Wang-Woodruff, PODS 2017] gave a streaming algorithm with (the optimal) space using $O(\varepsilon^{-2} \log(1/\delta))$ words and $O(\varepsilon^{-2} \log(1/\delta))$ update time to obtain an $\varepsilon$-accurate estimate with probability at least $1 - \delta$. We give the first algorithm that achieves update time of $O(\log(1/\delta))$, which is independent of the accuracy parameter $\varepsilon$, together with the nearly optimal space using $O(\varepsilon^{-2} \log(1/\delta))$ words. Our algorithm is obtained using the Count Sketch of [Charikar-Chen-Farach-Colton, ICALP 2002].

2012 ACM Subject Classification: Theory of computation → Sketching and sampling
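To make the setting concrete, the following is a minimal Python sketch of norm tracking with a Count Sketch table. It is an illustration under stated assumptions, not the paper's algorithm or analysis: the class name `CountSketchL2`, the helper `track`, the keyed BLAKE2 hash standing in for a limited-independence hash family, and the median-over-rows estimator are all choices made here for illustration. The point it shows is structural: each update touches one counter per row, so the update cost scales with the number of rows (on the order of log(1/δ)) and not with the row width (on the order of 1/ε²).

```python
import hashlib
import math
import random
from statistics import median


class CountSketchL2:
    """Toy Count Sketch for l2 norm estimation (illustrative, hypothetical API)."""

    def __init__(self, width, rows, seed=0):
        self.width = width            # counters per row, think O(1/eps^2)
        self.rows = rows              # independent rows, think O(log(1/delta))
        self.table = [[0] * width for _ in range(rows)]
        # Running sum of squared counters per row, so a query costs O(rows).
        self.sq_sums = [0] * rows
        rng = random.Random(seed)
        # Assumption: a keyed hash replaces the hash families used in the theory.
        self.keys = [rng.getrandbits(64).to_bytes(8, "little") for _ in range(rows)]

    def _bucket_and_sign(self, row, item):
        digest = hashlib.blake2b(
            item.to_bytes(8, "little"), digest_size=8, key=self.keys[row]
        ).digest()
        h = int.from_bytes(digest, "little")
        bucket = h % self.width
        sign = 1 if (h >> 63) & 1 else -1
        return bucket, sign

    def update(self, item, delta=1):
        # One counter per row is modified: O(rows) work, independent of width.
        for r in range(self.rows):
            b, s = self._bucket_and_sign(r, item)
            old = self.table[r][b]
            new = old + s * delta
            self.sq_sums[r] += new * new - old * old
            self.table[r][b] = new

    def estimate(self):
        # Each row's sum of squared counters estimates the squared l2 norm.
        return math.sqrt(median(self.sq_sums))


def track(stream, width, rows, seed=0):
    """Yield an l2 norm estimate after every stream item."""
    sketch = CountSketchL2(width, rows, seed)
    for item in stream:
        sketch.update(item)
        yield sketch.estimate()
```

Calling `list(track(stream, width=256, rows=5))` returns one estimate per stream position. Whether those estimates are simultaneously accurate at all times is exactly what the tracking guarantee concerns, and this sketch makes no claim about that.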

[1] Mikkel Thorup, et al. Tabulation-Based 5-Independent Hashing with Applications to Linear Probing and Second Moment Estimation, 2012, SIAM J. Comput.

[2] Balachander Krishnamurthy, et al. Sketch-based change detection: methods, evaluation, and applications, 2003, IMC '03.

[3] F. T. Wright, et al. A Bound on Tail Probabilities for Quadratic Forms in Independent Random Variables, 1971.

[4] Graham Cormode, et al. An improved data stream summary: the count-min sketch and its applications, 2004, J. Algorithms.

[5] David P. Woodruff, et al. Beating CountSketch for heavy hitters in insertion streams, 2015, STOC.

[6] Tadeusz Inglot, et al. Asymptotic optimality of new adaptive test in regression model, 2006.

[7] Kasper Green Larsen, et al. Time Lower Bounds for Nonadaptive Turnstile Streaming Algorithms, 2014, STOC.

[8] Daniel M. Kane, et al. Sparser Johnson-Lindenstrauss Transforms, 2010, JACM.

[9] Daniel M. Kane, et al. Almost Optimal Explicit Johnson-Lindenstrauss Families, 2011, APPROX-RANDOM.

[10] Mikkel Thorup, et al. Tabulation based 4-universal hashing with applications to second moment estimation, 2004, SODA '04.

[11] A. C. Berry. The accuracy of the Gaussian approximation to the sum of independent variates, 1941.

[12] Noga Alon, et al. The Space Complexity of Approximating the Frequency Moments, 1999.

[13] David P. Woodruff, et al. An optimal algorithm for the distinct elements problem, 2010, PODS '10.

[14] David P. Woodruff, et al. Fast moment estimation in data streams in optimal space, 2010, STOC '11.

[15] Ke Yi, et al. Tracking the Frequency Moments at All Times, 2014, ArXiv.

[16] Moses Charikar, et al. Finding frequent items in data streams, 2004, Theor. Comput. Sci.

[17] David P. Woodruff, et al. On the exact space complexity of sketching and streaming small norms, 2010, SODA '10.

[18] Jian Ding, et al. Continuous monitoring of ℓp norms in data streams, 2017, APPROX-RANDOM.

[19] Anirban Dasgupta, et al. A sparse Johnson-Lindenstrauss transform, 2010, STOC '10.

[20] David P. Woodruff, et al. Optimal Bounds for Johnson-Lindenstrauss Transforms and Streaming Problems with Subconstant Error, 2011, TALG.

[21] David P. Woodruff, et al. BPTree: An ℓ2 Heavy Hitters Algorithm Using Constant Memory, 2016, PODS.