Differentially Private High-Dimensional Data Publication via Markov Network

Differentially private data publication has recently received considerable attention. However, it faces some challenges in differentially private high-dimensional data publication, such as the complex attribute relationships, the high computational complexity and data sparsity. Therefore, we propose PrivMN, a novel method to publish high-dimensional data with differential privacy guarantee. We first use the Markov model to represent the mutual relationships between attributes to solve the problem that the direction of relationship between variables cannot be determined in practical application. We then take advantage of approximate inference to calculate the joint distribution of high-dimensional data under differential privacy to figure out the computational and spatial complexity of accurate reasoning. Extensive experiments on real datasets demonstrate that our solution makes the published high-dimensional synthetic datasets more efficient under the guarantee of differential privacy.

[1]  Tim Roughgarden,et al.  Interactive privacy via the median mechanism , 2009, STOC '10.

[2]  Peter Clifford,et al.  Markov Random Fields in Statistics , 2012 .

[3]  Latanya Sweeney,et al.  k-Anonymity: A Model for Protecting Privacy , 2002, Int. J. Uncertain. Fuzziness Knowl. Based Syst..

[4]  Li Xiong,et al.  DPCube: Releasing Differentially Private Data Cubes for Health Information , 2012, 2012 IEEE 28th International Conference on Data Engineering.

[5]  Chris Clifton,et al.  Top-k frequent itemsets via differentially private FP-trees , 2014, KDD.

[6]  Ninghui Li,et al.  Differentially private grids for geospatial data , 2012, 2013 IEEE 29th International Conference on Data Engineering (ICDE).

[7]  Yu Zhang,et al.  Differentially Private High-Dimensional Data Publication via Sampling-Based Inference , 2015, KDD.

[8]  Philip S. Yu,et al.  $\textsf{LoPub}$ : High-Dimensional Crowdsourced Data Publication With Local Differential Privacy , 2016, IEEE Transactions on Information Forensics and Security.

[9]  Wei Zhang,et al.  Differentially Private Network Data Release via Stochastic Kronecker Graph , 2016, WISE.

[10]  Dan Suciu,et al.  Boosting the accuracy of differentially private histograms through consistency , 2009, Proc. VLDB Endow..

[11]  Aaron Roth,et al.  Differentially Private Approximation Algorithms , 2009, ArXiv.

[12]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems , 1988 .

[13]  Mario Stefanelli,et al.  Contribution to the discussion of the paper by Steffen L. Lauritzen and David Spiegelhalter: "Local Computations with Probabilities on Graphical Structures and their Application to Expert Systems" , 1988 .

[14]  Xing Xie,et al.  PrivTree: A Differentially Private Algorithm for Hierarchical Decompositions , 2016, SIGMOD Conference.

[15]  J. M. Hammersley,et al.  Markov fields on finite graphs and lattices , 1971 .

[16]  Benjamin C. M. Fung,et al.  Publishing set-valued data via differential privacy , 2011, Proc. VLDB Endow..

[17]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[18]  Nir Friedman,et al.  Probabilistic Graphical Models: Principles and Techniques - Adaptive Computation and Machine Learning , 2009 .

[19]  Guy N. Rothblum,et al.  A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis , 2010, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science.

[20]  Cynthia Dwork,et al.  Differential Privacy , 2006, ICALP.

[21]  David J. Spiegelhalter,et al.  Local computations with probabilities on graphical structures and their application to expert systems , 1990 .

[22]  Shui Yu,et al.  Big Privacy: Challenges and Opportunities of Privacy Study in the Age of Big Data , 2016, IEEE Access.

[23]  Xiang Cheng,et al.  Differentially private multi-party high-dimensional data publishing , 2016, 2016 IEEE 32nd International Conference on Data Engineering (ICDE).

[24]  Ninghui Li,et al.  PriView: practical differentially private release of marginal contingency tables , 2014, SIGMOD Conference.

[25]  Cynthia Dwork,et al.  Privacy, accuracy, and consistency too: a holistic solution to contingency table release , 2007, PODS.

[26]  Kunal Talwar,et al.  Mechanism Design via Differential Privacy , 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[27]  Ju Ren,et al.  DPPro: Differentially Private High-Dimensional Data Release via Random Projection , 2017, IEEE Transactions on Information Forensics and Security.

[28]  Daniel A. Spielman,et al.  Spectral Graph Theory and its Applications , 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[29]  Cynthia Dwork,et al.  Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.