Learning rate adaptation for differentially private stochastic gradient descent

Differentially private learning has recently emerged as the leading approach to privacy-preserving machine learning. Differential privacy complicates learning procedures because each access to the data must be carefully designed and carries a privacy cost; for example, standard hyperparameter tuning with a validation set cannot be applied directly. In this paper, we propose a differentially private algorithm for adapting the learning rate of differentially private stochastic gradient descent (SGD) that avoids the use of a validation set. The adaptation is based on the technique of extrapolation from classical numerical analysis: to estimate the error against the gradient flow that underlies SGD, we compare the result of one full step with that of two half-steps. We prove the privacy of the method using the moments accountant, which allows us to compute tight privacy bounds. Empirically, we show that our method is competitive with manually tuned, commonly used optimisation methods for training deep neural networks and for differentially private variational inference.
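To make the extrapolation idea concrete, below is a minimal numpy sketch of the full-step versus two-half-step comparison. This is not the paper's exact algorithm: the function names (privatize, adaptive_step), the tolerance parameter tol, and the step-size controller constants are illustrative assumptions, and the paper's handling of the injected noise in the error estimate, its minibatch sampling, and the moments-accountant bookkeeping are not reproduced here.

    import numpy as np

    def privatize(per_example_grads, clip_norm, noise_mult, rng):
        # Standard DP-SGD gradient: clip each per-example gradient
        # in L2 norm, average, and add Gaussian noise.
        clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
                   for g in per_example_grads]
        mean = np.mean(clipped, axis=0)
        sigma = noise_mult * clip_norm / len(per_example_grads)
        return mean + rng.normal(0.0, sigma, size=mean.shape)

    def adaptive_step(theta, per_example_grad_fn, lr, tol,
                      clip_norm, noise_mult, rng):
        # One full Euler step of size lr.
        g0 = privatize(per_example_grad_fn(theta), clip_norm, noise_mult, rng)
        theta_full = theta - lr * g0
        # Two half-steps of size lr/2; the second uses a fresh
        # privatised gradient evaluated at the midpoint.
        theta_half = theta - 0.5 * lr * g0
        g1 = privatize(per_example_grad_fn(theta_half),
                       clip_norm, noise_mult, rng)
        theta_two = theta_half - 0.5 * lr * g1
        # The gap between the two results estimates the local error
        # against the gradient flow, as in ODE step-size control.
        err = np.linalg.norm(theta_two - theta_full)
        # Order-1 (Euler) controller: scale lr by sqrt(tol/err),
        # clamped so the learning rate never changes too abruptly.
        scale = np.clip(np.sqrt(tol / (err + 1e-12)), 0.5, 2.0)
        return theta_two, lr * scale

    # Toy usage on per-example losses 0.5 * ||x_i - theta||^2.
    rng = np.random.default_rng(0)
    data = rng.normal(size=(32, 2))
    grad_fn = lambda th: [th - x for x in data]
    theta, lr = np.zeros(2), 0.5
    for _ in range(100):
        theta, lr = adaptive_step(theta, grad_fn, lr, tol=1e-2,
                                  clip_norm=1.0, noise_mult=1.0, rng=rng)

The controller follows the classical pattern for a first-order method: the step size is rescaled by (tol/err)^(1/2), and the more accurate two-half-step result is kept as the new iterate.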
