Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo

We consider the problem of Bayesian learning on sensitive datasets and present two simple but somewhat surprising results that connect Bayesian learning to "differential privacy", a cryptographic approach to protecting individual-level privacy while permitting database-level utility. Specifically, we show that under standard assumptions, getting one sample from a posterior distribution is differentially private "for free", and that this sample, as a statistical estimator, is often consistent, near optimal, and computationally tractable. Similarly but separately, we show that a recent line of work that uses stochastic gradients for Hybrid Monte Carlo (HMC) sampling also preserves differential privacy with minor or no modifications to the algorithmic procedure. These observations lead to an "anytime" algorithm for Bayesian learning under privacy constraints. We demonstrate that it performs much better than state-of-the-art differentially private methods on synthetic and real datasets.
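To make the stochastic gradient sampling idea concrete, the following is a minimal sketch of a stochastic gradient Langevin dynamics (SGLD) update, one member of the stochastic gradient Monte Carlo family. It is an illustration only, not the procedure analyzed in the paper: the toy model (Gaussian data with unit variance and a Gaussian prior on the mean), the function name `sgld_gaussian_mean`, and all parameter values are assumptions chosen for the example, and no privacy calibration of the step size or noise is attempted here.

```python
import random


def sgld_gaussian_mean(data, n_steps=2000, batch_size=10, step_size=1e-3,
                       prior_var=10.0, seed=0):
    """Run SGLD for the mean theta of N(theta, 1) data with prior N(0, prior_var).

    Hypothetical toy example; returns the full chain of iterates.
    """
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        # Draw a minibatch without replacement.
        batch = rng.sample(data, batch_size)
        # Gradient of the log prior N(0, prior_var) at theta.
        grad_prior = -theta / prior_var
        # Minibatch gradient of the log likelihood, rescaled to the full dataset.
        grad_lik = (n / batch_size) * sum(x - theta for x in batch)
        # Langevin update: half a step along the gradient plus N(0, step_size) noise.
        noise = rng.gauss(0.0, step_size ** 0.5)
        theta += 0.5 * step_size * (grad_prior + grad_lik) + noise
        samples.append(theta)
    return samples


if __name__ == "__main__":
    gen = random.Random(1)
    toy_data = [gen.gauss(2.0, 1.0) for _ in range(100)]
    chain = sgld_gaussian_mean(toy_data)
    tail = chain[1000:]  # discard the first half as burn-in
    print("approximate posterior mean:", sum(tail) / len(tail))
```

The Gaussian noise added at each step is already part of the sampler, which is what makes this family of algorithms a natural candidate for the kind of privacy analysis the abstract describes. Any iterate can be read off as the current approximate posterior sample, so the chain can be stopped at any time.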
