Bootstrap Differential Privacy

This paper concerns the challenge of protecting confidentiality while making statistically useful data and analytical outputs available for research and policy analysis. In this context, differential privacy is an attractive confidentiality protection methodology because of its clear definition and the strong guarantees that it promises. A common concern, however, is that in some situations these guarantees are so strong that statistical usefulness becomes unacceptably low. In this paper, we propose a relaxation of differential privacy that allows confidentiality protection to be balanced against statistical usefulness. We give a practical illustration of the relaxation, implemented as Laplace noise addition, for the confidentiality protection of contingency and magnitude tables. Tables are amongst the most common types of output produced by national statistical agencies, and these outputs are often protected by noise addition.
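The abstract does not detail the proposed bootstrap relaxation itself, but the baseline it builds on, Laplace noise addition for a table of counts, is the standard &epsilon;-differentially-private Laplace mechanism. A minimal sketch of that baseline follows; the function name `laplace_protect` is illustrative, and the sensitivity of 1 assumes count queries under add/remove-one-record neighbouring datasets.

```python
import numpy as np

def laplace_protect(table, epsilon, sensitivity=1.0, rng=None):
    """Add i.i.d. Laplace noise to each cell of a count table.

    Standard epsilon-DP Laplace mechanism: noise scale b = sensitivity / epsilon.
    For a contingency table of counts with add/remove-one-record neighbours,
    the L1 sensitivity of each cell is 1.
    """
    rng = np.random.default_rng(rng)
    table = np.asarray(table, dtype=float)
    scale = sensitivity / epsilon
    return table + rng.laplace(loc=0.0, scale=scale, size=table.shape)

# Example: protect a 2x3 contingency table with epsilon = 1.
counts = np.array([[12, 7, 31],
                   [5, 22, 9]])
noisy = laplace_protect(counts, epsilon=1.0, rng=0)
```

Note that smaller values of `epsilon` give stronger protection (larger noise scale), which is precisely the privacy-utility trade-off the abstract describes; a relaxation of differential privacy changes how this scale is calibrated, not the additive mechanism itself.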
