On the Meaning and Limits of Empirical Differential Privacy

Empirical differential privacy (EDP) has been proposed as an alternative to differential privacy (DP), with the important advantages that the procedure can be applied to any Bayesian model and requires less technical work on the part of the user. While EDP has been shown to be easy to implement, little is known about its theoretical underpinnings. This paper offers a careful investigation of the meaning and limits of EDP as a measure of privacy. We show that EDP cannot simply be considered an empirical version of DP, and that it could instead be thought of as a sensitivity measure on posterior distributions. We also show that EDP is not well-defined, in that its value depends crucially on the choice of discretization used in the procedure, and that it can be very computationally intensive to apply in practice. We illustrate these limitations with two simple conjugate Bayesian models: the beta-binomial model and the normal-normal model.
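To make the discretization issue concrete, the following is a minimal sketch for the beta-binomial case. It is not the procedure studied in the paper, which the abstract does not spell out. It assumes a simplified proxy: the largest absolute log-ratio of binned posterior probabilities between two datasets that differ in one record, with a uniform Beta(1, 1) prior and equal-width bins on [0, 1]. The function names (`beta_cdf`, `bin_probs`, `max_log_ratio`) and all parameter values are hypothetical choices made for illustration.

```python
from math import comb, log


def beta_cdf(x, a, b):
    """CDF of Beta(a, b) at x for positive integers a, b.

    Uses the identity P(Beta(a, b) <= x) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def bin_probs(a, b, k):
    """Probability that Beta(a, b) falls in each of k equal-width bins on [0, 1]."""
    cdf = [beta_cdf(i / k, a, b) for i in range(k + 1)]
    return [cdf[i + 1] - cdf[i] for i in range(k)]


def max_log_ratio(n, x, k, a0=1, b0=1):
    """Assumed proxy, not the paper's definition.

    Compares the posterior after x successes in n trials with the posterior
    after x + 1 successes (one record flipped), both under a Beta(a0, b0)
    prior, and returns the largest absolute log-ratio over the k bins.
    """
    p = bin_probs(a0 + x, b0 + n - x, k)
    q = bin_probs(a0 + x + 1, b0 + n - x - 1, k)
    return max(abs(log(pi / qi)) for pi, qi in zip(p, q))


if __name__ == "__main__":
    # Same data, same neighbouring dataset, different numbers of bins.
    for k in (5, 10, 20):
        print(k, max_log_ratio(n=10, x=5, k=k))
```

Under these assumptions the data and the neighbouring dataset are held fixed, so any change in the reported value across `k` comes from the binning alone. Because the ratio of the two posterior densities is unbounded near the endpoints of [0, 1], refining the bins increases the maximum in this sketch.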
