DP-Finder: Finding Differential Privacy Violations by Sampling and Optimization

We present DP-Finder, a novel approach and system that automatically derives lower bounds on the differential privacy enforced by algorithms. Lower bounds are practically useful as they can show tightness of existing upper bounds or even identify incorrect upper bounds. Computing a lower bound involves searching for a counterexample, defined by two neighboring inputs and a set of outputs, that identifies a large privacy violation. This is an inherently hard problem as finding such a counterexample involves inspecting a large (usually infinite) and sparse search space. To address this challenge, DP-Finder relies on two key insights. First, we introduce an effective and precise correlated sampling method to estimate the privacy violation of a counterexample. Second, we show how to obtain a differentiable version of the problem, enabling us to phrase the search task as an optimization objective to be maximized with state-of-the-art numerical optimizers. This allows us to systematically search for large privacy violations. Our experimental results indicate that DP-Finder is effective in computing differential privacy lower bounds for a number of randomized algorithms. For instance, it finds tight lower bounds in algorithms that obfuscate their input in a non-trivial fashion.
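
The correlated sampling idea can be illustrated with a minimal sketch. It is not DP-Finder's implementation. It assumes a scalar Laplace mechanism, an output set of the form `{y : y >= threshold}`, and hypothetical names (`laplace_mechanism`, `estimate_eps`) chosen only for this example. Both probabilities are estimated from the same noise draws, so their sampling errors are correlated and largely cancel in the ratio.

```python
import numpy as np


def laplace_mechanism(x, noise):
    """Hypothetical toy mechanism: add pre-drawn Laplace noise to a scalar input."""
    return x + noise


def estimate_eps(x, x_prime, threshold, n=200_000, scale=1.0, seed=0):
    """Estimate log(Pr[F(x) in S] / Pr[F(x') in S]) for S = {y : y >= threshold}.

    The same noise samples are reused for both inputs (correlated sampling),
    which is an assumption of this sketch about how to share randomness.
    """
    rng = np.random.default_rng(seed)
    noise = rng.laplace(0.0, scale, size=n)  # shared randomness for both runs
    p = np.mean(laplace_mechanism(x, noise) >= threshold)
    q = np.mean(laplace_mechanism(x_prime, noise) >= threshold)
    if p == 0.0 or q == 0.0:
        raise ValueError("too few samples landed in S; increase n or change threshold")
    return float(np.log(p / q))


if __name__ == "__main__":
    # Neighboring inputs 1.0 and 0.0; with scale 1.0 the mechanism is 1-DP,
    # so the estimate should be close to (and is a sample-based proxy for) 1.
    print(estimate_eps(1.0, 0.0, threshold=3.0))
```

An estimate like this is noisy, so it only suggests a lower bound; turning it into a search, as the abstract describes, would additionally require replacing the hard membership test `>= threshold` with a smooth surrogate so that a numerical optimizer can adjust the inputs and the output set.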
