Private Stochastic Convex Optimization: Optimal Rates in 𝓁1 Geometry

Stochastic convex optimization over an ℓ1-bounded domain is ubiquitous in machine learning applications such as LASSO, but remains poorly understood when learning with differential privacy. We show that, up to logarithmic factors, the optimal excess population loss of any (ε, δ)-differentially private optimizer is √(log(d)/n) + √d/(εn). The upper bound is based on a new algorithm that combines the iterative localization approach of Feldman et al. (2020a) with a new analysis of private regularized mirror descent. It applies to ℓp-bounded domains for p ∈ [1, 2] and queries at most n^{3/2} gradients, improving over the best previously known algorithm for the ℓ2 case, which needs n^2 gradients. Further, we show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by √(log(d)/n) + (log(d)/(εn))^{2/3}. This bound is achieved by a new variance-reduced version of the Frank-Wolfe algorithm that requires just a single pass over the data. We also show that the lower bound in this case is the minimum of the two rates mentioned above.
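The Frank-Wolfe method is well suited to ℓ1 geometry because its linear subproblem over the ℓ1 ball is minimized at one of the 2d signed coordinate vectors, so privatizing a step reduces to a noisy selection among coordinates. The sketch below is only a rough illustration of that idea, not the paper's variance-reduced algorithm: the names `dp_frank_wolfe` and `noise_scale` are hypothetical, and the Laplace noise here is not calibrated to any formal (ε, δ) guarantee.

```python
import numpy as np

def dp_frank_wolfe(grad_fn, d, steps=100, noise_scale=0.1, seed=0):
    """Illustrative noisy Frank-Wolfe over the unit l1 ball.

    Each step computes a gradient, perturbs the 2d vertex scores
    <g, +/- e_i> with Laplace noise (a report-noisy-min selection),
    and moves toward the selected vertex. In a real DP analysis,
    noise_scale would be calibrated to the gradient sensitivity
    and the privacy budget.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(d)  # starts inside the l1 ball
    for t in range(steps):
        g = grad_fn(x)
        # Scores of the 2d vertices of the l1 ball: <g, +e_i> = g_i, <g, -e_i> = -g_i.
        scores = np.concatenate([g, -g])
        noisy = scores + rng.laplace(scale=noise_scale, size=2 * d)
        j = int(np.argmin(noisy))
        v = np.zeros(d)
        v[j % d] = 1.0 if j < d else -1.0
        gamma = 2.0 / (t + 2.0)  # standard Frank-Wolfe step size
        x = (1.0 - gamma) * x + gamma * v  # convex combination stays in the ball
    return x
```

Because every iterate is a convex combination of ℓ1-ball vertices, the output needs no projection, which is one reason Frank-Wolfe-type methods are attractive for private optimization in this geometry.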

[1]  Kunal Talwar, et al.  Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy Amplification by Shuffling, 2021, IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS).

[2]  Raef Bassily, et al.  Non-Euclidean Differentially Private Stochastic Convex Optimization, 2021, COLT.

[3]  Kunal Talwar, et al.  Private Stochastic Convex Optimization: Optimal Rates in Linear Time, 2020, STOC.

[4]  Raef Bassily, et al.  Private Stochastic Convex Optimization with Optimal Rates, 2019, NeurIPS.

[5]  Volkan Cevher, et al.  Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator, 2019, ICML.

[6]  Tong Zhang, et al.  SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator, 2018, NeurIPS.

[7]  Yurii Nesterov, et al.  Relatively Smooth Convex Optimization by First-Order Methods, and Applications, 2016, SIAM J. Optim.

[8]  Vitaly Feldman, et al.  Generalization of ERM in Stochastic Convex Optimization: The Dimension Strikes Back, 2016, NIPS.

[9]  Ian Goodfellow, et al.  Deep Learning with Differential Privacy, 2016, CCS.

[10]  Yoram Singer, et al.  Train Faster, Generalize Better: Stability of Stochastic Gradient Descent, 2015, ICML.

[11]  Thomas Steinke, et al.  Between Pure and Approximate Differential Privacy, 2015, J. Priv. Confidentiality.

[12]  Li Zhang, et al.  Nearly Optimal Private LASSO, 2015, NIPS.

[13]  Moni Naor, et al.  Pure Differential Privacy for Rectangle Queries via Private Partitions, 2015, ASIACRYPT.

[14]  Aaron Roth, et al.  The Algorithmic Foundations of Differential Privacy, 2014, Found. Trends Theor. Comput. Sci.

[15]  Prateek Jain, et al.  (Near) Dimension Independent Risk Bounds for Differentially Private Learning, 2014, ICML.

[16]  Raef Bassily, et al.  Differentially Private Empirical Risk Minimization: Efficient Algorithms and Tight Error Bounds, 2014, FOCS.

[17]  Shai Ben-David, et al.  Understanding Machine Learning: From Theory to Algorithms, 2014.

[18]  Mark W. Schmidt, et al.  A Simpler Approach to Obtaining an O(1/t) Convergence Rate for the Projected Stochastic Subgradient Method, 2012, ArXiv.

[19]  Daniel Kifer, et al.  Private Convex Empirical Risk Minimization and High-dimensional Regression, 2012, COLT.

[20]  Anand D. Sarwate, et al.  Differentially Private Empirical Risk Minimization, 2011, J. Mach. Learn. Res.

[21]  Evgueni A. Haroutunian, et al.  Information Theory and Statistics, 2011, International Encyclopedia of Statistical Science.

[22]  Ambuj Tewari, et al.  Composite Objective Mirror Descent, 2010, COLT.

[23]  Moni Naor, et al.  Differential Privacy Under Continual Observation, 2010, STOC.

[24]  Ohad Shamir, et al.  Stochastic Convex Optimization, 2009, COLT.

[25]  Moni Naor, et al.  Our Data, Ourselves: Privacy via Distributed Noise Generation, 2006, EUROCRYPT.

[26]  Cynthia Dwork, et al.  Calibrating Noise to Sensitivity in Private Data Analysis, 2006, TCC.