A Framework for Adversarially Robust Streaming Algorithms

We investigate the adversarial robustness of streaming algorithms. In this context, an algorithm is considered robust if its performance guarantees hold even when the stream is chosen adaptively by an adversary that observes the outputs of the algorithm along the stream and can react in an online manner. While deterministic streaming algorithms are inherently robust, many central problems in the streaming literature do not admit sublinear-space deterministic algorithms; on the other hand, classical space-efficient randomized algorithms for these problems are generally not adversarially robust. This raises the natural question of whether there exist efficient adversarially robust (randomized) streaming algorithms for these problems. In this work, we show that the answer is positive for various important streaming problems in the insertion-only model, including distinct elements and, more generally, $F_p$-estimation, $F_p$-heavy hitters, entropy estimation, and others. For all of these problems, we develop adversarially robust $(1+\varepsilon)$-approximation algorithms whose required space matches that of the best known non-robust algorithms up to a $\mathrm{poly}(\log n, 1/\varepsilon)$ multiplicative factor (and in some cases even up to a constant factor). Towards this end, we develop several generic tools allowing one to efficiently transform a non-robust streaming algorithm into a robust one in various scenarios.
