Privacy for Free: Communication-Efficient Learning with Differential Privacy Using Sketches

Communication and privacy are two critical concerns in distributed learning, yet many existing works treat them separately. In this work, we argue that a natural connection exists between methods for communication reduction and privacy preservation in distributed machine learning. In particular, we prove that Count Sketch, a simple method for data-stream summarization, has inherent differential privacy properties. Building on these derived privacy guarantees, we propose DiffSketch, a novel sketch-based framework for distributed learning that compresses transmitted messages via sketches to simultaneously achieve communication efficiency and provable privacy benefits. Our evaluation demonstrates that DiffSketch provides strong differential privacy guarantees (e.g., $\varepsilon = 1$) and reduces communication by 20-50x with only marginal decreases in accuracy. Compared to baselines that treat privacy and communication separately, DiffSketch improves absolute test accuracy by 5%-50% while offering the same privacy guarantees and communication compression.
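To make the compression step concrete, below is a minimal, illustrative Count Sketch in Python/NumPy. The class and parameter names are our own, not the paper's implementation, and the privacy analysis is not reproduced here; this sketch shows only the encode/aggregate/decode mechanics the abstract refers to, under the assumption that clients share hash functions via a common seed.

```python
import numpy as np

class CountSketch:
    """Toy Count Sketch for a dim-dimensional vector.

    The table has `rows` independent hash rows and `cols` buckets per
    row. Sketches built with the same seed are linear: summing two
    clients' tables yields the sketch of the summed gradients, which
    is what lets a server aggregate compressed updates directly.
    """

    def __init__(self, rows, cols, dim, seed=0):
        rng = np.random.default_rng(seed)  # shared seed -> shared hash functions
        self.rows, self.cols, self.dim = rows, cols, dim
        # Per row: a bucket index and a +/-1 sign for every coordinate.
        self.bucket = rng.integers(0, cols, size=(rows, dim))
        self.sign = rng.choice([-1.0, 1.0], size=(rows, dim))
        self.table = np.zeros((rows, cols))

    def encode(self, vec):
        # Each coordinate adds its signed value to one bucket per row;
        # np.add.at accumulates correctly under hash collisions.
        for r in range(self.rows):
            np.add.at(self.table[r], self.bucket[r], self.sign[r] * vec)

    def decode(self):
        # Per-row unbiased estimates; the median across rows suppresses
        # collision noise contributed by other coordinates.
        est = np.stack([self.sign[r] * self.table[r, self.bucket[r]]
                        for r in range(self.rows)])
        return np.median(est, axis=0)


# Compress a sparse-ish 10,000-dim "gradient" into 5 x 200 = 1,000
# counters (a 10x reduction); heavy coordinates decode accurately.
grad = np.zeros(10_000)
grad[:10] = 5 * np.random.default_rng(1).standard_normal(10)
sk = CountSketch(rows=5, cols=200, dim=10_000)
sk.encode(grad)
recovered = sk.decode()
```

The hash collisions that make decoding approximate are also the intuition behind the paper's claim that the sketch itself perturbs individual contributions; the formal differential privacy argument is in the paper, not in this toy code.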
