On the Tradeoff Between Data-Privacy and Utility for Data Publishing

A typical mechanism for privacy-preserving data publishing adds random noise to the original data before it is published. Whatever noise is added, there is some chance that the original data can be estimated to within a given accuracy. The probability that a malicious receiver infers the original data to within a given interval is measured by $(\alpha, \beta)$-data-privacy. Adding random noise also reduces the utility of the published data. In this paper, we investigate the tradeoff between data privacy and data utility under $(\alpha, \beta)$-data-privacy, with the aim of finding an optimal noise distribution. For maximizing the weighted sum of privacy and utility, we prove that when the added noise is symmetric and data utility is measured by the $\ell^1$- or $\ell^2$-norm function, the optimal noise follows the uniform distribution. We then investigate the optimal noise for maximizing data utility under a given privacy guarantee, and show that the optimal noise is a group of impulse functions. Finally, we compare $(\alpha, \beta)$-data-privacy with differential privacy and derive an inequality relating the privacy parameters of the two notions. Simulations are conducted to validate the results.
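
To make the privacy-utility tradeoff concrete, the following Python sketch compares two zero-mean noise distributions at the same $\ell^1$ distortion. It rests on assumptions that are not stated in the abstract: the published value is $y = x + \theta$, the receiver has no prior knowledge of $x$ and simply guesses $\hat{x} = y$, and the disclosure probability is taken to be $\Pr\{|\theta| \le \alpha\}$. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def disclosure_prob_uniform(alpha, a):
    """Pr{|theta| <= alpha} for theta ~ Uniform[-a, a] (assumed noise model)."""
    return min(1.0, alpha / a)


def disclosure_prob_laplace(alpha, b):
    """Pr{|theta| <= alpha} for theta ~ Laplace(0, b) (assumed noise model)."""
    return 1.0 - math.exp(-alpha / b)


def empirical_disclosure(sample_noise, alpha, trials=100_000, seed=0):
    """Monte Carlo estimate of Pr{|theta| <= alpha}.

    Assumes the receiver guesses x_hat = y, so the estimation error is theta.
    """
    rng = random.Random(seed)
    hits = sum(abs(sample_noise(rng)) <= alpha for _ in range(trials))
    return hits / trials


def sample_uniform(rng, a=2.0):
    return rng.uniform(-a, a)


def sample_laplace(rng, b=1.0):
    # The difference of two i.i.d. Exp(1) variables is Laplace(0, 1).
    return b * (rng.expovariate(1.0) - rng.expovariate(1.0))


if __name__ == "__main__":
    alpha = 0.5
    # Both settings give E|theta| = 1: a / 2 for Uniform[-a, a], b for Laplace(0, b).
    print("uniform, closed form:", disclosure_prob_uniform(alpha, 2.0))
    print("laplace, closed form:", disclosure_prob_laplace(alpha, 1.0))
    print("uniform, simulated:  ", empirical_disclosure(sample_uniform, alpha))
    print("laplace, simulated:  ", empirical_disclosure(sample_laplace, alpha))
```

Under these assumptions, uniform noise gives a lower disclosure probability than Laplace noise at equal expected absolute distortion, which is in line with the optimality of the uniform distribution stated above. The sketch illustrates the quantities involved and is not a reproduction of the paper's proofs or simulations.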
