Towards Byzantine-resilient Learning in Decentralized Systems

With the proliferation of IoT devices and edge computing, decentralized learning is becoming increasingly attractive. A major challenge in designing a decentralized learning system is Byzantine Fault Tolerance (BFT). Prior work has explored Byzantine-resilient solutions for centralized distributed learning, but no existing solution achieves both strong security and high efficiency in decentralized systems. In this paper, we propose Mozi, a novel algorithm that achieves BFT in decentralized learning systems. Specifically, Mozi provides a uniform Byzantine-resilient aggregation rule with which each benign node selects useful parameter updates and filters out malicious ones in every training iteration. It guarantees that each benign node in a decentralized system can train a correct model even under strong Byzantine attacks with an arbitrary number of faulty nodes. We provide a theoretical analysis proving the uniform convergence of the proposed algorithm, and experimental evaluations demonstrate that Mozi achieves higher security and efficiency than all existing solutions.
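To make the aggregation idea concrete, the following is a minimal sketch of a selection-and-average step of the kind the abstract describes: a benign node ranks the updates received from its neighbors, keeps only the most plausible ones, and averages them with its own. The distance-based selection criterion, the function name, and the `num_keep` parameter are illustrative assumptions, not the actual Mozi rule defined in the paper body.

```python
from typing import List


def resilient_aggregate(
    own_params: List[float],
    neighbor_params: List[List[float]],
    num_keep: int,
) -> List[float]:
    """Illustrative Byzantine-resilient aggregation for one benign node.

    Keeps the `num_keep` neighbor updates closest (in Euclidean
    distance) to the node's own parameters, then averages them together
    with the node's own update, so extreme malicious updates are
    filtered out before they can poison the local model.
    """
    def dist(a: List[float], b: List[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # Rank neighbor updates by their distance to this node's parameters.
    ranked = sorted(neighbor_params, key=lambda p: dist(p, own_params))
    selected = ranked[:num_keep]

    # Average the surviving updates with the node's own parameters.
    pool = [own_params] + selected
    return [sum(p[i] for p in pool) / len(pool) for i in range(len(own_params))]


# A benign node at [1.0, 1.0] with two honest neighbors and one
# Byzantine neighbor broadcasting an extreme update.
result = resilient_aggregate(
    [1.0, 1.0],
    [[1.1, 0.9], [0.9, 1.1], [100.0, -100.0]],  # last update is malicious
    num_keep=2,
)
# The malicious update is filtered out; the result stays near [1.0, 1.0].
```

Note that a distance-only criterion is known to be circumventable by stealthy attackers who stay within the benign variance; Mozi's actual rule, together with the convergence analysis, is what provides the guarantee stated in the abstract.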
