Sensitivity Analysis for Non-Interactive Differential Privacy: Bounds and Efficient Algorithms

Differential privacy (DP) has gained significant attention lately as the state of the art in privacy protection. It achieves privacy by adding noise to query answers. We study the problem of privately and accurately answering a set of statistical range queries in batch mode (i.e., under non-interactive DP). The noise magnitude in DP depends directly on the sensitivity of a query set, and calculating sensitivity was proven to be NP-hard. Therefore, efficiently bounding the sensitivity of a given query set is still an open research problem. In this work, we propose upper bounds on sensitivity that are tighter than those in previous work. We also propose a formulation to exactly calculate sensitivity for a set of COUNT queries. However, it is impractical to implement these bounds without sophisticated methods. We therefore introduce methods that build a graph model <inline-formula><tex-math notation="LaTeX">$G$</tex-math><alternatives><mml:math><mml:mi>G</mml:mi></mml:math><inline-graphic xlink:href="inan-ieq1-2734664.gif"/></alternatives></inline-formula> based on a query set <inline-formula><tex-math notation="LaTeX">$Q$</tex-math><alternatives><mml:math><mml:mi>Q</mml:mi></mml:math><inline-graphic xlink:href="inan-ieq2-2734664.gif"/></alternatives></inline-formula>, such that implementing the aforementioned bounds can be achieved by solving two well-known clique problems on <inline-formula><tex-math notation="LaTeX">$G$</tex-math><alternatives><mml:math><mml:mi>G</mml:mi></mml:math><inline-graphic xlink:href="inan-ieq3-2734664.gif"/></alternatives></inline-formula>. We make use of the literature in solving these clique problems to realize our bounds efficiently. Experimental results show that for query sets with a few hundred queries, it takes only a few seconds to obtain results.

[1]  Yücel Saygin,et al.  Graph-based modelling of query sets for differential privacy , 2016, SSDBM.

[2]  Andreas Haeberlen,et al.  Linear dependent types for differential privacy , 2013, POPL.

[3]  David Eppstein,et al.  Listing All Maximal Cliques in Large Sparse Real-World Graphs , 2011, JEAL.

[4]  Yue Wang,et al.  A Data- and Workload-Aware Query Answering Algorithm for Range Queries Under Differential Privacy , 2014, Proc. VLDB Endow..

[5]  Marco Gaboardi,et al.  Sensitivity of Counting Queries , 2016, ICALP.

[6]  Anand D. Sarwate,et al.  Differentially Private Empirical Risk Minimization , 2009, J. Mach. Learn. Res..

[7]  David Eppstein,et al.  Listing All Maximal Cliques in Sparse Graphs in Near-optimal Time , 2010, Exact Complexity of NP-hard Problems.

[8]  Ashwin Machanavajjhala,et al.  No free lunch in data privacy , 2011, SIGMOD '11.

[9]  Kunal Talwar,et al.  Mechanism Design via Differential Privacy , 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[10]  Chu-Min Li,et al.  An Efficient Branch-and-Bound Algorithm Based on MaxSAT for the Maximum Clique Problem , 2010 .

[11]  Yufei Tao,et al.  Output perturbation with query relaxation , 2008, Proc. VLDB Endow..

[12]  Akira Tanaka,et al.  The worst-case time complexity for generating all maximal cliques and computational experiments , 2006, Theor. Comput. Sci..

[13]  Andrew McGregor,et al.  The matrix mechanism: optimizing linear counting queries under differential privacy , 2015, The VLDB Journal.

[14]  Yin Yang,et al.  Low-Rank Mechanism: Optimizing Batch Queries under Differential Privacy , 2012, Proc. VLDB Endow..

[15]  Benjamin C. Pierce,et al.  Distance makes the types grow stronger: a calculus for differential privacy , 2010, ICFP '10.

[16]  Ashwin Machanavajjhala,et al.  Principled Evaluation of Differentially Private Algorithms using DPBench , 2015, SIGMOD Conference.

[17]  Tim Roughgarden,et al.  Universally utility-maximizing privacy mechanisms , 2008, STOC '09.

[18]  Nabil R. Adam,et al.  Security-control methods for statistical databases: a comparative study , 1989, ACM Comput. Surv..

[19]  Sofya Raskhodnikova,et al.  Smooth sensitivity and sampling in private data analysis , 2007, STOC '07.

[20]  Richard M. Karp,et al.  Reducibility Among Combinatorial Problems , 1972, 50 Years of Integer Programming.

[21]  Moni Naor,et al.  Our Data, Ourselves: Privacy Via Distributed Noise Generation , 2006, EUROCRYPT.

[22]  Katrina Ligett,et al.  A Simple and Practical Algorithm for Differentially Private Data Release , 2010, NIPS.

[23]  Surajit Chaudhuri,et al.  An overview of data warehousing and OLAP technology , 1997, SGMD.

[24]  Catuscia Palamidessi,et al.  Differential Privacy for Relational Algebra: Improving the Sensitivity Bounds via Constraint Systems , 2012, QAPL.

[25]  Cynthia Dwork,et al.  Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.

[26]  J. Moon,et al.  On cliques in graphs , 1965 .

[27]  Qinghua Wu,et al.  A review on algorithms for maximum clique problems , 2015, Eur. J. Oper. Res..

[28]  Elaine Shi,et al.  GUPT: privacy preserving data analysis made easy , 2012, SIGMOD Conference.

[29]  Vitaly Shmatikov,et al.  Airavat: Security and Privacy for MapReduce , 2010, NSDI.

[30]  Andrew McGregor,et al.  Optimizing linear counting queries under differential privacy , 2009, PODS.

[31]  Moni Naor,et al.  On the Difficulties of Disclosure Prevention in Statistical Databases or The Case for Differential Privacy , 2010, J. Priv. Confidentiality.