Paper information - Scalable and Differentially Private Distributed Aggregation in the Shuffled Model

Scalable and Differentially Private Distributed Aggregation in the Shuffled Model

Federated learning promises to make machine learning feasible on distributed, private datasets by implementing gradient descent using secure aggregation methods. The idea is to compute a global weight update without revealing the contributions of individual users. Current practical protocols for secure aggregation work in an "honest but curious" setting where a curious adversary observing all communication to and from the server cannot learn any private information, assuming the server is honest and follows the protocol. A more scalable and robust primitive for privacy-preserving protocols is shuffling of user data, so as to hide the origin of each data item. Highly scalable and secure protocols for shuffling, so-called mixnets, have been proposed as a primitive for privacy-preserving analytics in the Encode-Shuffle-Analyze framework by Bittau et al., which was later analytically studied by Erlingsson et al. and Cheu et al. The recent papers by Cheu et al. and Balle et al. have given protocols for secure aggregation that achieve differential privacy guarantees in this "shuffled model". Their protocols come at a cost, though: either the expected aggregation error or the amount of communication per user scales as a polynomial $n^{\Omega(1)}$ in the number of users $n$. In this paper we propose a simple and more efficient protocol for aggregation in the shuffled model, where both communication and error increase only polylogarithmically in $n$. Our new technique is a conceptual "invisibility cloak" that makes users' data almost indistinguishable from random noise while introducing zero distortion on the sum.
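One way to picture the "zero distortion on the sum" idea is additive secret sharing followed by a shuffle. The sketch below is only an illustration under stated assumptions, not the protocol from the paper: the function names `encode` and `shuffle_and_aggregate`, the modulus, and the number of shares per user are all hypothetical choices, and the code makes no differential privacy claim on its own. Each user splits an integer into shares that individually look like uniform noise, the shuffler mixes all shares from all users, and the analyzer recovers the exact sum modulo the chosen modulus.

```python
import random


def encode(value, num_shares, modulus, rng):
    """Split `value` into `num_shares` additive shares modulo `modulus`.

    The first num_shares - 1 shares are uniform random; the last one is
    chosen so that all shares add up to `value` modulo `modulus`.
    """
    shares = [rng.randrange(modulus) for _ in range(num_shares - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def shuffle_and_aggregate(user_values, num_shares, modulus, seed=0):
    """Simulate users, shuffler, and analyzer in one process.

    Assumption: `random.Random` is used only for reproducibility of this
    illustration; it is not a cryptographically secure generator.
    """
    rng = random.Random(seed)
    messages = []
    for value in user_values:
        messages.extend(encode(value, num_shares, modulus, rng))
    # The shuffler hides which user sent which message.
    rng.shuffle(messages)
    # The analyzer only needs the sum, which the shares preserve exactly.
    return sum(messages) % modulus


values = [3, 7, 1, 9]
total = shuffle_and_aggregate(values, num_shares=5, modulus=2**16)
```

In this sketch the result equals the true sum as long as that sum is smaller than the modulus; how many shares are needed for a given privacy level is exactly the kind of question the paper's analysis addresses and is not modeled here.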

[1] Vaidy S. Sunderam et al. Secure multiparty aggregation with differential privacy: a comparative study, 2013, EDBT '13.

[2] Borja Balle et al. Differentially Private Summation with Multi-Message Shuffling, 2019, ArXiv.

[3] Blaise Agüera y Arcas et al. Communication-Efficient Learning of Deep Networks from Decentralized Data, 2016, AISTATS.

[4] Michael Kearns et al. Efficient noise-tolerant learning from statistical queries, 1993, STOC.

[5] Úlfar Erlingsson et al. Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity, 2018, SODA.

[6] David P. Woodruff. Sketching as a Tool for Numerical Linear Algebra, 2014, Found. Trends Theor. Comput. Sci.

[7] Borja Balle et al. The Privacy Blanket of the Shuffle Model, 2019, CRYPTO.

[8] Úlfar Erlingsson et al. Prochlo: Strong Privacy for Analytics in the Crowd, 2017, SOSP.

[9] Badih Ghazi et al. Private Aggregation from Fewer Anonymous Messages, 2019, EUROCRYPT.

[10] Adam D. Smith et al. Distributed Differential Privacy via Shuffling, 2018, IACR Cryptol. ePrint Arch.

[11] Rafail Ostrovsky et al. Cryptography from Anonymity, 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[12] Ian Goodfellow et al. Deep Learning with Differential Privacy, 2016, CCS.

[13] Sarvar Patel et al. Practical Secure Aggregation for Privacy-Preserving Machine Learning, 2017, IACR Cryptol. ePrint Arch.