CGM: An Enhanced Mechanism for Streaming Data Collection with Local Differential Privacy

Local differential privacy (LDP) is a well-established privacy protection scheme for collecting sensitive data, which has been integrated into major platforms such as iOS, Chrome, and Windows. The main idea is that each individual randomly perturbs her data on her local device, and only uploads the noisy version to an untrusted data aggregator. This paper focuses on the collection of streaming data consisting of regular updates, e.g., daily app usage. Such streams, when aggregated over a large population, often exhibit strong autocorrelations, e.g., the average usage of an app usually does not change dramatically from one day to the next. To our knowledge, this property has been largely neglected in existing LDP mechanisms. Consequently, data collected with current LDP methods often exhibit unrealistically violent fluctuations due to the added noise, drowning out the overall trend, as shown in our experiments. This paper proposes a novel correlated Gaussian mechanism (CGM) for enforcing (ε, δ)-LDP on streaming data collection, which reduces noise by exploiting publicly known autocorrelation patterns of the aggregated data. This is done through non-trivial modifications to the core of the underlying Gaussian mechanism; in particular, CGM injects temporally correlated noise, computed through an optimization program that takes into account the given autocorrelation pattern, data value range, and utility metric. CGM comes with a formal proof of correctness, and consumes negligible computational resources. Extensive experiments using real datasets from different application domains demonstrate that CGM achieves consistent and significant utility gains compared to the baseline method of repeatedly running the underlying one-shot LDP mechanism.

PVLDB Reference Format: Ergute Bao, Yin Yang, Xiaokui Xiao, and Bolin Ding. CGM: An Enhanced Mechanism for Streaming Data Collection with Local Differential Privacy. PVLDB, 14(11): 2258-2270, 2021. doi:10.14778/3476249.3476277

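To make the comparison point concrete, below is a minimal Python sketch of the baseline described in the abstract: repeatedly running a one-shot Gaussian mechanism on each update of the stream. The sketch is ours, not the paper's (the names gaussian_sigma and perturb_stream_baseline, and all example parameters, are assumptions), and it uses the classical calibration σ = Δ·sqrt(2·ln(1.25/δ))/ε, which is valid for ε < 1; CGM's optimization-based correlated noise is deliberately not reproduced here.

import numpy as np

def gaussian_sigma(epsilon, delta, sensitivity):
    # Classical Gaussian-mechanism calibration (valid for epsilon < 1).
    # The paper modifies the core of the underlying Gaussian mechanism;
    # this simpler textbook formula only serves the sketch.
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

def perturb_stream_baseline(stream, epsilon, delta, lo, hi, seed=None):
    # Baseline: perturb every update independently on the user's device.
    # `lo` and `hi` are the public value range, so per-step sensitivity
    # is hi - lo. Note that the per-step (epsilon, delta) costs compose
    # over the length of the stream; allocating an overall budget is
    # outside this sketch.
    rng = np.random.default_rng(seed)
    sigma = gaussian_sigma(epsilon, delta, sensitivity=hi - lo)
    noisy = []
    for x in stream:
        x = min(max(x, lo), hi)          # clip to the public range
        noisy.append(x + rng.normal(0.0, sigma))
    return noisy

# Example: a slowly varying daily-usage stream, reported under a per-step
# (0.5, 1e-5)-LDP Gaussian perturbation.
daily_usage = [30 + 0.2 * t for t in range(100)]
reported = perturb_stream_baseline(daily_usage, epsilon=0.5, delta=1e-5,
                                   lo=0.0, hi=120.0, seed=7)

Because each time step receives an independent noise draw, the reported stream fluctuates heavily around the smooth true trend, which is exactly the artifact the paper attributes to existing LDP methods and which CGM's temporally correlated noise is designed to mitigate.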