Privacy Amplification via Random Check-Ins

Differentially private stochastic gradient descent (DP-SGD) is a fundamental building block in many applications that learn over sensitive data. Two standard approaches, privacy amplification by subsampling and privacy amplification by shuffling, permit adding less noise in DP-SGD than naïve schemes do. A key assumption in both approaches is that the elements of the data set can be uniformly sampled or uniformly permuted, constraints that can become prohibitive when the data is processed in a decentralized or distributed fashion. In this paper, we focus on conducting iterative methods like DP-SGD in the setting of federated learning (FL), where the data is distributed among many devices (clients). Our main contribution is the \emph{random check-in} distributed protocol, which crucially relies only on randomized participation decisions made locally and independently by each client. It has privacy/accuracy trade-offs similar to privacy amplification by subsampling/shuffling, yet it requires neither server-initiated communication nor knowledge of the population size. To our knowledge, this is the first privacy amplification technique tailored to a distributed learning framework, and it may have broader applicability beyond FL. Along the way, we extend privacy amplification by shuffling to incorporate $(\epsilon,\delta)$-DP local randomizers, and we exponentially improve its guarantees. In practical regimes, this improvement allows for similar privacy and utility using data from an order of magnitude fewer users.
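To make the client-side mechanics concrete, below is a minimal sketch of the check-in step in Python. The names (client_check_in, run_round, participation_prob, num_steps) are illustrative rather than taken from the paper, and the sketch omits the DP-SGD update and local randomizer that the full protocol attaches to each selected client; it shows only the locally randomized participation decision and the server's slot-filling view.

```python
import random


def client_check_in(num_steps: int, participation_prob: float):
    """Local, independent check-in decision made by a single client.

    With probability `participation_prob` the client volunteers and picks a
    uniformly random time slot in {0, ..., num_steps - 1}; otherwise it
    abstains. No server coordination, and no knowledge of the population
    size, is needed for this decision.
    """
    if random.random() < participation_prob:
        return random.randrange(num_steps)  # chosen check-in slot
    return None  # client sits this round out


def run_round(client_ids, num_steps: int, participation_prob: float):
    """Server-side view: collect check-ins, then serve one client per slot."""
    # Each client checks in locally and independently of all others.
    slots = {}
    for cid in client_ids:
        slot = client_check_in(num_steps, participation_prob)
        if slot is not None:
            slots.setdefault(slot, []).append(cid)

    schedule = []
    for t in range(num_steps):
        arrivals = slots.get(t, [])
        # If several clients picked slot t, use one of them at random;
        # if none did, the step proceeds with a dummy (noise-only) update.
        schedule.append(random.choice(arrivals) if arrivals else None)
    return schedule


# Example: 1000 clients, 100 SGD steps, each client volunteering w.p. 0.2.
print(run_round(range(1000), num_steps=100, participation_prob=0.2))
```

The key design point is that all participation randomness is generated on the clients: the server never chooses who participates and need not know how many clients exist, which is what makes the amplification argument compatible with decentralized data.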
