PAC Learning from Distributed Data in the Presence of Malicious Nodes

When data is distributed over a network, statistical learning must be carried out in a fully distributed fashion. When all nodes in the network are fault-free and cooperate with each other, it is well understood that the objective is probably approximately correct (PAC) learnable. However, when malicious nodes try to sabotage learning by injecting false information into the network, PAC learnability of the objective remains an open question. In this paper, we study the distributed statistical learning problem when the risk function is strictly convex. By proposing and analyzing a distributed learning algorithm, we show that the model is PAC learnable even in the presence of malicious nodes. We also report experiments in non-convex settings to further examine the PAC learnability of non-convex statistical learning problems from distributed data.
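The abstract does not spell out the algorithm, but the flavor of such Byzantine-resilient schemes can be illustrated. Recall the standard PAC criterion: for any ε, δ > 0, given enough samples the learner must output a hypothesis ŵ with Pr[ R(ŵ) − min_w R(w) ≤ ε ] ≥ 1 − δ. The following is a minimal Python sketch, not the paper's actual algorithm: distributed gradient descent in which per-node gradients are screened with a coordinate-wise trimmed mean, a common defense when at most b nodes are malicious. All names (trimmed_mean, robust_distributed_gd) and parameters here are illustrative assumptions.

```python
import numpy as np

def trimmed_mean(values, b):
    # values: (num_nodes, dim) array of per-node gradients.
    # Per coordinate, drop the b largest and b smallest entries across
    # nodes (screening possibly malicious values), average the rest.
    sorted_vals = np.sort(values, axis=0)
    return sorted_vals[b:values.shape[0] - b].mean(axis=0)

def robust_distributed_gd(per_node_grads, w0, b, steps=200, lr=0.1):
    # per_node_grads(w) -> (num_nodes, dim) array; up to b rows may be
    # arbitrary (malicious). Requires num_nodes > 2*b so the trim always
    # leaves an honest majority in every coordinate.
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(steps):
        grads = per_node_grads(w)
        w = w - lr * trimmed_mean(grads, b)
    return w

# Toy usage: 10 nodes minimize a strictly convex quadratic risk;
# 2 nodes report arbitrary (malicious) gradients.
rng = np.random.default_rng(0)
local_means = rng.normal(1.0, 0.1, size=(10, 3))

def per_node_grads(w):
    g = w - local_means        # gradient of 0.5 * ||w - mean_i||^2
    g[:2] = 100.0              # two malicious nodes inject false values
    return g

print(robust_distributed_gd(per_node_grads, np.zeros(3), b=2))
```

Because each coordinate discards the b most extreme values before averaging, a minority of malicious nodes cannot drag the update arbitrarily far; combined with strict convexity of the risk, screened updates of this kind are the typical ingredient behind PAC-style guarantees in this setting.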
