CALM: Consistent Adaptive Local Marginal for Marginal Release under Local Differential Privacy

Marginal tables are the workhorse for capturing the correlations among a set of attributes. We consider the problem of constructing marginal tables from a set of users' multi-dimensional data while satisfying Local Differential Privacy (LDP), a privacy notion that protects each individual user's privacy without relying on a trusted third party. Existing works on this problem perform poorly in the high-dimensional setting; even worse, some incur very expensive computational overhead. In this paper, we propose CALM (Consistent Adaptive Local Marginal), which builds on a careful analysis of the problem's challenges and performs consistently better than existing methods. More importantly, CALM scales well with large data dimensions and marginal sizes. We conduct extensive experiments on several real-world datasets. Experimental results demonstrate the effectiveness and efficiency of CALM over existing methods.
