Detecting Violations of Differential Privacy

The widespread acceptance of differential privacy has led to the publication of many sophisticated algorithms for protecting privacy. However, due to the subtle nature of this privacy definition, many such algorithms have bugs that cause them to violate their claimed privacy guarantees. In this paper, we consider the problem of producing counterexamples for such incorrect algorithms. The counterexamples are designed to be short and human-understandable so that the counterexample generator can be used during development: a developer could quickly explore variations of an algorithm and investigate where they break down. Our approach is statistical in nature. It runs a candidate algorithm many times and uses statistical tests to detect violations of differential privacy. An evaluation on a variety of incorrect published algorithms validates the usefulness of our approach: it correctly rejects incorrect algorithms and provides counterexamples for them within a few seconds.
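The core idea of the statistical approach can be sketched as follows. For a candidate mechanism claiming ε-differential privacy, pick two neighboring databases D1 and D2 and an output event S, estimate Pr[M(D1) ∈ S] and Pr[M(D2) ∈ S] by repeated sampling, and flag a violation when one probability exceeds e^ε times the other by more than the sampling error allows. The sketch below is illustrative, not the paper's actual tool: the mechanisms, event, and slack threshold are assumptions chosen for the example, and a real detector would search for the event and inputs and use a rigorous hypothesis test rather than a fixed slack.

```python
# Minimal sketch of statistical detection of differential-privacy violations.
# All names (buggy_mechanism, violates_dp, the event, the slack) are
# illustrative assumptions, not the paper's actual algorithm or API.
import math
import random

def buggy_mechanism(db, epsilon):
    # Claims eps-DP but adds no noise at all, so any output-distinguishing
    # event on neighboring databases exposes it.
    return sum(db)

def laplace_mechanism(db, epsilon):
    # Correct eps-DP sum for databases with entries in {0, 1} (sensitivity 1).
    # Laplace(1/eps) noise sampled as the difference of two exponentials.
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return sum(db) + noise

def violates_dp(mech, d1, d2, epsilon, event, trials=20000, slack=0.05):
    # Estimate p1 = Pr[mech(d1) in S] and p2 = Pr[mech(d2) in S] by sampling,
    # then check the DP inequality in both directions with a crude slack term
    # (a real detector would use a proper hypothesis test instead).
    p1 = sum(event(mech(d1, epsilon)) for _ in range(trials)) / trials
    p2 = sum(event(mech(d2, epsilon)) for _ in range(trials)) / trials
    return (p1 > math.exp(epsilon) * p2 + slack or
            p2 > math.exp(epsilon) * p1 + slack)

random.seed(0)
d1 = [1, 1, 1, 0, 0]            # neighboring databases: differ in one entry
d2 = [1, 1, 1, 1, 0]
event = lambda out: out >= 3.5  # output event S distinguishing the two sums

print(violates_dp(buggy_mechanism, d1, d2, 1.0, event))    # True: no noise
print(violates_dp(laplace_mechanism, d1, d2, 1.0, event))  # False: correct
```

Note that a single pair (D1, D2, S) passing the test proves nothing; the detector gains power by searching over many inputs and events, and any flagged pair is itself the short, human-readable counterexample the abstract describes.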
