Differentially Private Bayesian Inference for Generalized Linear Models

The framework of differential privacy (DP) upper bounds the information disclosure risk involved in using sensitive datasets for statistical analysis. A DP mechanism typically operates by adding carefully calibrated noise to the data release procedure. Generalized linear models (GLMs) are among the most widely used tools in a data analyst's repertoire. In this work, with logistic and Poisson regression as running examples, we propose a generic noise-aware Bayesian framework to quantify the parameter uncertainty for a GLM at hand, given noisy sufficient statistics. We perform a tight privacy analysis and experimentally demonstrate that the posteriors obtained from our model, while adhering to strong privacy guarantees, are similar to the non-private posteriors.
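To make "carefully calibrated noise" and "noisy sufficient statistics" concrete, the following is a minimal sketch of releasing a noisy sum of per-record vectors with the Gaussian mechanism. It is not the mechanism or privacy analysis of this work: it assumes the classic calibration sigma = Delta * sqrt(2 ln(1.25/delta)) / epsilon, which holds only for 0 < epsilon < 1, and an add-or-remove-one-record neighbourhood relation. The function names, the clipping bound, and the toy data are all hypothetical.

```python
import math
import random


def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Noise scale for the classic Gaussian mechanism (assumes 0 < epsilon < 1)."""
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("classic calibration needs 0 < epsilon < 1 and 0 < delta < 1")
    return l2_sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def clip_row(row, clip_norm):
    """Rescale a record so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in row))
    scale = min(1.0, clip_norm / norm) if norm > 0.0 else 1.0
    return [v * scale for v in row]


def noisy_sum(rows, clip_norm, epsilon, delta, rng):
    """Release the sum of clipped records plus Gaussian noise.

    Assumption: neighbouring datasets differ by adding or removing one record,
    so the L2 sensitivity of the clipped sum equals clip_norm.
    """
    clipped = [clip_row(r, clip_norm) for r in rows]
    dim = len(rows[0])
    exact = [sum(r[j] for r in clipped) for j in range(dim)]
    sigma = gaussian_sigma(clip_norm, epsilon, delta)
    return [s + rng.gauss(0.0, sigma) for s in exact], sigma


# Toy data (hypothetical): three records with two features each.
rng = random.Random(0)
rows = [[0.3, -1.2], [2.0, 0.5], [-0.7, 0.1]]
release, sigma = noisy_sum(rows, clip_norm=1.0, epsilon=0.5, delta=1e-5, rng=rng)
```

A noise-aware model would then treat `release` as an observation of the exact sum corrupted by Gaussian noise of known scale `sigma`, rather than plugging it in as if it were exact.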
