Advanced Probabilistic Couplings for Differential Privacy

Differential privacy is a promising formal approach to data privacy: it provides a quantitative bound on the privacy cost of an algorithm that operates on sensitive information. Several tools have been developed for the formal verification of differentially private algorithms, including program logics and type systems. However, these tools do not capture fundamental techniques that have emerged in recent years, and so cannot be used to reason about cutting-edge differentially private algorithms. In particular, existing techniques fail to handle three broad classes of algorithms: (1) algorithms whose privacy depends on accuracy guarantees, (2) algorithms analyzed with the advanced composition theorem, which gives slower growth of the privacy cost under composition, and (3) algorithms that interactively accept adaptive inputs. We address these limitations with a new formalism that extends apRHL, a relational program logic previously used for proving differential privacy of non-interactive algorithms, and incorporates aHL, a (non-relational) program logic for accuracy properties. We illustrate our approach through a single running example that exemplifies all three classes of algorithms and explores new variants of the Sparse Vector technique, a well-studied algorithm from the privacy literature. We implement our logic in EasyCrypt and formally verify privacy of our running example. We also introduce a novel coupling technique, the optimal subset coupling, which may be of independent interest.
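
For orientation, the advanced composition theorem mentioned above is the standard bound of Dwork, Rothblum, and Vadhan: composing k mechanisms that are each (ε, δ)-differentially private yields, for any δ' > 0, an (ε', kδ + δ')-differentially private mechanism with

    ε' = ε·sqrt(2k·ln(1/δ')) + k·ε·(e^ε − 1),

so for small ε the privacy cost grows roughly like sqrt(k) rather than linearly in k. The sketch below shows the textbook Above Threshold algorithm, the c = 1 case of the Sparse Vector technique, purely as background; it is not one of the new variants studied in the paper, and the noise scales follow the usual monograph presentation.

    import numpy as np

    def above_threshold(queries, data, threshold, epsilon):
        """Textbook Above Threshold (Sparse Vector with c = 1).

        Each query q must have sensitivity 1 in `data`; the whole run
        is epsilon-differentially private.
        """
        # Perturb the public threshold once, with Laplace noise of scale 2/epsilon.
        noisy_t = threshold + np.random.laplace(scale=2.0 / epsilon)
        for i, q in enumerate(queries):
            # Fresh per-query Laplace noise of scale 4/epsilon.
            nu = np.random.laplace(scale=4.0 / epsilon)
            if q(data) + nu >= noisy_t:
                return i   # index of the first query above the noisy threshold
        return None        # no query exceeded the noisy threshold

Informally, only the single positive answer is charged against the privacy budget, while the stream of negative answers comes essentially for free; this is what makes the technique attractive, and also why its many variants are closely studied.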
