Recursive mechanism: towards node differential privacy and unrestricted joins

Existing differential privacy (DP) studies mainly consider aggregation on data sets where each entry corresponds to a particular participant to be protected. In many situations, a user may pose a relational algebra query on a database with sensitive data and want a differentially private aggregation of the query result. However, no existing work is able to release such an aggregation when the query contains unrestricted join operations. This severely limits the applications of existing DP techniques, because many data analysis tasks require unrestricted joins. One example is subgraph counting on a graph. Furthermore, existing methods for differentially private subgraph counting support only edge DP and are limited to very simple subgraphs. Until recently, it was an open problem whether any nontrivial graph statistic could be released with reasonable accuracy under node DP for arbitrary input graphs. In this paper, we propose a novel differentially private mechanism that supports unrestricted joins, to release an approximation of a linear statistic of the result of a positive relational algebra calculation over a sensitive database. The error bound of the approximate answer is roughly proportional to the empirical sensitivity of the query, a new notion that measures the maximum possible change to the query answer when a participant withdraws its data from the sensitive database. For subgraph counting, our mechanism provides a solution that achieves node DP for any kind of subgraph.
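
To make the notion of a participant withdrawing its data concrete, the following is a minimal sketch for triangle counting, assuming each node is a participant and reading the description above literally: it computes the largest drop in the triangle count when a single node and its incident edges are removed. The function names (`count_triangles`, `remove_node`, `max_change_on_withdrawal`) are hypothetical and chosen for illustration. This is not the proposed mechanism, the paper's formal definition may differ from this reading, and the value computed here is data dependent, so releasing it directly would not be private.

```python
from itertools import combinations


def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0
    for u, nbrs in adj.items():
        # Each pair of neighbours of u that are themselves adjacent closes a triangle.
        for v, w in combinations(nbrs, 2):
            if w in adj[v]:
                total += 1
    return total // 3  # every triangle is found once from each of its three nodes


def remove_node(edges, node):
    """Return the edge list after a node withdraws, i.e. without its incident edges."""
    return [(u, v) for u, v in edges if u != node and v != node]


def max_change_on_withdrawal(edges):
    """Largest drop in the triangle count caused by removing any single node."""
    nodes = {x for e in edges for x in e}
    base = count_triangles(edges)
    return max(
        (base - count_triangles(remove_node(edges, n)) for n in nodes),
        default=0,
    )


# Two triangles that share node 3.
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (3, 5)]
print(count_triangles(edges))
print(max_change_on_withdrawal(edges))
```

In this toy graph both triangles pass through node 3, so withdrawing that node changes the count the most. Under edge DP only one edge changes at a time, which is why node DP is the harder setting for subgraph counts.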
