Privacy-Preserving Correlated Data Publication: Privacy Analysis and Optimal Noise Design

Privacy in data publication is a critical issue and has been extensively studied. Published data are almost always correlated, owing to social, physical, behavioral, and genetic relationships. However, most existing works assume that private data are independent, i.e., they neglect the correlation among data. In this paper, we investigate the privacy of data publication under both deterministic and probabilistic correlations. Specifically, we propose $(\varepsilon, \delta)$-multi-dimensional data-privacy (MDDP) to quantify the privacy of correlated data. It characterizes the disclosure probability when the published data are estimated jointly with the correlation to within a given accuracy. We then explore the effects of deterministic and probabilistic correlations on privacy disclosure. For both kinds of correlation, we show that privacy disclosure increases when the correlation is known, compared with the case without correlation knowledge. We also derive a closed-form expression for the disclosure probability and a strict bound on the privacy disclosure gain, respectively. To minimize the disclosure probability, we provide the optimal noise distribution in the sense of $(\varepsilon, \delta)$-MDDP. Extensive simulations on a real dataset verify our analytical results.
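The claim that correlation knowledge increases disclosure can be illustrated with a small Monte Carlo sketch. It is not the estimator or the noise design analyzed in the paper. It assumes two private values tied by the deterministic correlation $x_2 = x_1$, independent zero-mean Gaussian noise of standard deviation `sigma` added to each published value, and an estimator that either uses $y_1$ alone or averages $y_1$ and $y_2$. The function name `disclosure_probability` and all parameter values are hypothetical.

```python
import math
import random


def disclosure_probability(eps, sigma, use_correlation, trials=20000, seed=0):
    """Estimate Pr(|x1_hat - x1| <= eps) by simulation.

    Assumptions (not from the paper): x2 = x1 exactly, independent
    Gaussian noise on each published value, and a simple averaging
    estimator when the correlation is exploited.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x1 = rng.uniform(-10.0, 10.0)    # private value
        x2 = x1                          # assumed deterministic correlation
        y1 = x1 + rng.gauss(0.0, sigma)  # published (noisy) values
        y2 = x2 + rng.gauss(0.0, sigma)
        # Without correlation knowledge the estimate is y1 alone;
        # with it, both publications are noisy copies of x1 and are averaged.
        x1_hat = 0.5 * (y1 + y2) if use_correlation else y1
        if abs(x1_hat - x1) <= eps:
            hits += 1
    return hits / trials


p_without = disclosure_probability(eps=0.5, sigma=1.0, use_correlation=False)
p_with = disclosure_probability(eps=0.5, sigma=1.0, use_correlation=True)
print(f"disclosure probability without correlation: {p_without:.3f}")
print(f"disclosure probability with correlation:    {p_with:.3f}")
```

Under these assumptions the two probabilities have the closed forms $\operatorname{erf}(\varepsilon / (\sigma\sqrt{2}))$ and $\operatorname{erf}(\varepsilon / \sigma)$, so the simulated values should lie close to them, with the second larger than the first. Averaging halves the noise variance, which is why exploiting the correlation raises the chance of an estimate within $\varepsilon$.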
