Differentially Private Bootstrap: New Privacy Analysis and Inference Strategies

Differentially private (DP) mechanisms protect individual-level information by introducing randomness into the statistical analysis procedure. Despite the availability of numerous DP tools, there remains a lack of general techniques for conducting statistical inference under DP. We examine a DP bootstrap procedure that releases multiple private bootstrap estimates to infer the sampling distribution and construct confidence intervals (CIs). Our privacy analysis presents new results on the privacy cost of a single DP bootstrap estimate, applicable to any DP mechanism, and identifies some misapplications of the bootstrap in the existing literature. Using the Gaussian-DP (GDP) framework (Dong et al., 2022), we show that the release of $B$ DP bootstrap estimates from mechanisms satisfying $(\mu/\sqrt{(2-2/\mathrm{e})B})$-GDP asymptotically satisfies $\mu$-GDP as $B$ goes to infinity. Moreover, we apply deconvolution to the DP bootstrap estimates to accurately infer the sampling distribution, an approach that is novel in DP. We derive CIs from our density estimate for tasks such as population mean estimation, logistic regression, and quantile regression, and we compare them to existing methods using simulations and real-world experiments on 2016 Canada Census data. Our private CIs achieve the nominal coverage level and offer the first approach to private inference for quantile regression.
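To make the budget allocation concrete, the following is a minimal Python sketch, not the authors' implementation. It computes the per-estimate GDP parameter $\mu/\sqrt{(2-2/\mathrm{e})B}$ stated above (note $2-2/\mathrm{e} \approx 1.264$) and uses it to release $B$ noisy bootstrap means. The function names, the clamping bounds `lower` and `upper`, the sensitivity `(upper - lower) / n`, and the choice of the Gaussian mechanism are all assumptions made for illustration. The overall $\mu$-GDP guarantee is asymptotic in $B$, so for finite $B$ this allocation should be read as an approximation.

```python
import math
import numpy as np


def per_estimate_gdp(mu_total: float, B: int) -> float:
    """Per-estimate GDP parameter mu / sqrt((2 - 2/e) * B) from the abstract."""
    return mu_total / math.sqrt((2.0 - 2.0 / math.e) * B)


def dp_bootstrap_means(x, mu_total, B, lower=0.0, upper=1.0, rng=None):
    """Release B noisy bootstrap means (illustrative sketch only).

    Assumptions (not taken from the source): data are clamped to
    [lower, upper], the mean of n clamped values has sensitivity
    (upper - lower) / n, and each release uses the Gaussian mechanism
    with noise scale sensitivity / mu_b, which is mu_b-GDP.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.clip(np.asarray(x, dtype=float), lower, upper)
    n = x.size
    mu_b = per_estimate_gdp(mu_total, B)
    sigma = (upper - lower) / (n * mu_b)
    # Each row holds the indices of one bootstrap sample drawn with replacement.
    idx = rng.integers(0, n, size=(B, n))
    boot_means = x[idx].mean(axis=1)
    return boot_means + rng.normal(0.0, sigma, size=B)


if __name__ == "__main__":
    data = np.random.default_rng(1).uniform(size=500)
    estimates = dp_bootstrap_means(data, mu_total=1.0, B=100)
    print(per_estimate_gdp(1.0, 100), estimates[:3])
```

The released estimates are the bootstrap means plus Gaussian noise of known scale `sigma`; that known noise distribution is what a deconvolution step would remove when estimating the sampling distribution.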
