Practical and Private (Deep) Learning without Sampling or Shuffling

We consider training models with differential privacy (DP) using mini-batch gradients. The existing state-of-the-art, Differentially Private Stochastic Gradient Descent (DP-SGD), requires privacy amplification by sampling or shuffling to obtain the best privacy/accuracy/computation trade-offs. Unfortunately, the precise requirements on exact sampling and shuffling can be hard to obtain in important practical scenarios, particularly federated learning (FL). We design and analyze a DP variant of Follow-The-Regularized-Leader (DP-FTRL) that compares favorably (both theoretically and empirically) to amplified DP-SGD, while allowing for much more flexible data access patterns. DP-FTRL does not use any form of privacy amplification.
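At its core, DP-FTRL replaces amplified subsampling with the tree-aggregation ("binary") mechanism for privately releasing prefix sums of clipped gradients, so each gradient influences only about log2(T) noise-perturbed values regardless of how the data is ordered or batched, and the model is recovered from the noisy prefix sum. The sketch below illustrates this idea under simplifying assumptions and is not the paper's implementation: the names TreeAggregator, dp_ftrl, and the user-supplied grad_fn oracle are illustrative; it clips the whole mini-batch gradient rather than per-example gradients; and it omits momentum, tree restarts, and the more efficient node estimators used in practice.

import numpy as np


class TreeAggregator:
    # Binary-tree ("binary mechanism") release of prefix sums: each clipped
    # gradient enters at most ceil(log2(T)) + 1 tree nodes, every node gets
    # independent Gaussian noise, and the privacy accounting needs no
    # sampling or shuffling assumption.
    def __init__(self, dim, noise_std, rng):
        self.dim, self.noise_std, self.rng = dim, noise_std, rng
        self.t = 0
        self.alpha = {}      # level -> exact partial sum over a dyadic range
        self.alpha_hat = {}  # level -> noisy version of that partial sum

    def append(self, x):
        # Insert one clipped gradient and return the noisy prefix sum so far.
        self.t += 1
        i = (self.t & -self.t).bit_length() - 1  # lowest set bit of t
        self.alpha[i] = x + sum((self.alpha[j] for j in range(i)),
                                np.zeros(self.dim))
        for j in range(i):  # levels below i are folded into level i
            self.alpha.pop(j, None)
            self.alpha_hat.pop(j, None)
        self.alpha_hat[i] = self.alpha[i] + self.rng.normal(
            0.0, self.noise_std, size=self.dim)
        return sum((self.alpha_hat[j] for j in range(self.t.bit_length())
                    if (self.t >> j) & 1), np.zeros(self.dim))


def dp_ftrl(theta0, batches, grad_fn, clip_norm, noise_multiplier, lr, seed=0):
    # theta_t = theta_0 - lr * (noisy prefix sum of clipped gradients).
    # For brevity this clips each mini-batch gradient as a whole; production
    # implementations clip per-example gradients before summing.
    rng = np.random.default_rng(seed)
    tree = TreeAggregator(theta0.size, noise_multiplier * clip_norm, rng)
    theta = theta0.copy()
    for batch in batches:
        g = grad_fn(theta, batch)
        g = g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))  # clip
        theta = theta0 - lr * tree.append(g)
    return theta

Because the noise across rounds is shared through common tree nodes, the error in the released prefix sum grows only polylogarithmically in the number of steps, which is what lets this style of mechanism remain competitive with amplified DP-SGD without any sampling or shuffling assumptions.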
