A Note on Sanitizing Streams with Differential Privacy

The literature on data sanitization aims to design algorithms that take an input dataset and produce a privacy-preserving version of it that captures some of its statistical properties. In this note we study this question from a streaming perspective: our goal is to sanitize a data stream. Specifically, we consider low-memory algorithms that operate on a data stream and produce an alternative, privacy-preserving stream that captures some statistical properties of the original input stream.
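As a minimal sketch of the general idea (and not the algorithm of this note), classical randomized response can be viewed as a constant-memory stream sanitizer: each element of a bit stream is perturbed independently, yielding an alternative, differentially private stream from which aggregate statistics such as the mean can still be recovered after debiasing. The function names below are illustrative, not from the note.

```python
import math
import random

def sanitize_stream(stream, epsilon):
    """Randomized response over a stream of bits.

    Each bit is kept with probability p = e^eps / (e^eps + 1) and
    flipped otherwise, so each output bit is eps-differentially
    private and the sanitizer uses O(1) memory per element.
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    for bit in stream:
        yield bit if random.random() < p else 1 - bit

def debias_mean(sanitized_bits, epsilon):
    """Estimate the mean of the original bit stream from the sanitized one.

    If q is the observed mean of the sanitized bits, the original mean
    is (q - (1 - p)) / (2p - 1), since E[output] = p*m + (1-p)*(1-m)
    when the true mean is m.
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    q = sum(sanitized_bits) / len(sanitized_bits)
    return (q - (1.0 - p)) / (2.0 * p - 1.0)

if __name__ == "__main__":
    random.seed(0)
    # Synthetic input stream: 100,000 bits with true mean 0.3.
    original = [1 if random.random() < 0.3 else 0 for _ in range(100_000)]
    sanitized = list(sanitize_stream(original, epsilon=1.0))
    print(round(debias_mean(sanitized, epsilon=1.0), 2))
```

This per-element (local) model is stronger than necessary for the setting above, where the sanitizer may hold a small amount of state across elements; it is shown only to make the "input stream in, private stream out" interface concrete.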
