Private Stochastic Convex Optimization with Optimal Rates

We study differentially private (DP) algorithms for stochastic convex optimization (SCO). In this problem, the goal is to approximately minimize the population loss given i.i.d. samples from a distribution over convex and Lipschitz loss functions. A long line of existing work on private convex optimization focuses on the empirical loss and derives asymptotically tight bounds on the excess empirical loss. However, a significant gap exists in the known bounds for the population loss. We show that, up to logarithmic factors, the optimal excess population loss for DP algorithms is equal to the larger of the optimal non-private excess population loss and the optimal excess empirical loss of DP algorithms. This implies that, contrary to intuition based on private ERM, private SCO has asymptotically the same rate of $1/\sqrt{n}$ as non-private SCO in the parameter regime most common in practice. The best previous result in this setting gives a rate of $1/n^{1/4}$. Our approach builds on existing differentially private algorithms and relies on the analysis of algorithmic stability to ensure generalization.
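As a concrete reading of the stated bound (the abstract itself gives no explicit formula), consider the standard DP-SCO setting of convex, $1$-Lipschitz losses over a domain of diameter $1$ under $(\epsilon,\delta)$-differential privacy, with sample size $n$ and dimension $d$; under these assumptions the claim corresponds, up to logarithmic factors, to an optimal excess population loss of
$$\tilde{\Theta}\!\left(\max\left\{\frac{1}{\sqrt{n}},\ \frac{\sqrt{d}}{n\epsilon}\right\}\right) \;=\; \tilde{\Theta}\!\left(\frac{1}{\sqrt{n}} + \frac{\sqrt{d}}{n\epsilon}\right),$$
where the first term is the non-private statistical rate and the second is the known optimal excess empirical loss for DP-ERM. Whenever $\sqrt{d}/\epsilon \lesssim \sqrt{n}$, the regime described above as most common in practice, the $1/\sqrt{n}$ term dominates and privacy comes at no asymptotic cost.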
