Adversarially Robust Streaming Algorithms via Differential Privacy

A streaming algorithm is said to be adversarially robust if its accuracy guarantees hold even when the data stream is chosen maliciously by an adaptive adversary. We establish a connection between the adversarial robustness of streaming algorithms and the notion of differential privacy. This connection allows us to design new adversarially robust streaming algorithms that outperform the current state-of-the-art constructions in many interesting parameter regimes.
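
The abstract does not describe the construction, so the following is only a toy sketch of one way such a connection could be used: run several independent copies of a non-robust (oblivious) estimator and release only a differentially private median of their estimates. The names `SampledCounter`, `private_median`, `MedianOfCopies`, and `demo`, the sampling rate, the candidate grid, and the value of `epsilon` are all assumptions made for illustration. The median is computed with the exponential mechanism over a fixed grid, and the sketch does not account for privacy loss across repeated queries.

```python
import math
import random


class SampledCounter:
    """Toy oblivious estimator of stream length (hypothetical, for illustration).

    Each update is counted with probability p; the estimate is count / p.
    """

    def __init__(self, p, rng):
        self.p = p
        self.rng = rng
        self.count = 0

    def update(self, item):
        if self.rng.random() < self.p:
            self.count += 1

    def estimate(self):
        return self.count / self.p


def private_median(values, candidates, epsilon, rng):
    """Exponential mechanism over a fixed, data-independent candidate grid.

    A candidate scores higher the closer it is to splitting the values in
    half. Changing one value moves any score by at most 1 (sensitivity 1).
    """
    half = len(values) / 2
    weights = []
    for c in candidates:
        at_or_below = sum(1 for v in values if v <= c)
        score = -abs(at_or_below - half)
        weights.append(math.exp(epsilon * score / 2))
    return rng.choices(candidates, weights=weights, k=1)[0]


class MedianOfCopies:
    """Feeds every update to all copies; answers with a private median."""

    def __init__(self, num_copies, candidates, epsilon, seed=0):
        self.rng = random.Random(seed)
        self.candidates = list(candidates)
        self.epsilon = epsilon
        # Each copy gets its own independent source of randomness.
        self.copies = [
            SampledCounter(0.5, random.Random(self.rng.getrandbits(64)))
            for _ in range(num_copies)
        ]

    def update(self, item):
        for copy in self.copies:
            copy.update(item)

    def query(self):
        estimates = [copy.estimate() for copy in self.copies]
        return private_median(estimates, self.candidates, self.epsilon, self.rng)


def demo(stream_length=200):
    # Assumed parameters: 101 copies, integer grid 0..400, epsilon = 1.
    sketch = MedianOfCopies(num_copies=101, candidates=range(0, 401), epsilon=1.0)
    for item in range(stream_length):
        sketch.update(item)
    return sketch.query()
```

Calling `demo()` returns an integer from the grid that should lie near the true stream length. The intuition behind the design is that an adversary who observes only the noisy median learns little about the internal randomness of any single copy, which makes it harder to craft future updates that break the estimators.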
