Generalized ADMM in Distributed Learning via Variational Inequality

Due to the explosion in the size and complexity of modern data sets, and the privacy concerns of data holders, it is increasingly important to solve machine learning problems in a distributed manner. The Alternating Direction Method of Multipliers (ADMM), through the concept of consensus variables, is a practical algorithm in this context, and its many variants and their performance have been studied across different application areas. In this paper, we study the effect of users' local data sets on distributed learning with ADMM. Our aim is to deploy variational inequality (VI) theory to attain a unified view of ADMM variants. Through simulation results, we demonstrate how more general definitions of the consensus parameters, together with the introduction of uncertain parameters into the distributed approach, can improve the results of the learning process.
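To make the consensus mechanism referred to above concrete, the following is a minimal sketch of consensus ADMM for distributed least squares; the problem instance, parameter values, and variable names are illustrative assumptions, not the paper's actual formulation. Each agent holds local data and solves a local subproblem, and a shared consensus variable couples the agents.

```python
import numpy as np

# Minimal consensus-ADMM sketch (hypothetical example, not the paper's setup).
# Each of N agents holds local data (A_i, b_i) and the network minimizes
#   sum_i (1/2) * ||A_i x_i - b_i||^2   subject to   x_i = z  (consensus).

rng = np.random.default_rng(0)
N, d = 4, 3                          # number of agents, feature dimension
A = [rng.standard_normal((20, d)) for _ in range(N)]
x_true = rng.standard_normal(d)
b = [Ai @ x_true for Ai in A]        # noiseless local observations

rho = 1.0                            # ADMM penalty parameter (tunable)
x = [np.zeros(d) for _ in range(N)]  # local primal variables
u = [np.zeros(d) for _ in range(N)]  # scaled dual variables
z = np.zeros(d)                      # consensus variable

for _ in range(200):
    # local (parallelizable) x-updates:
    # (A_i^T A_i + rho I) x_i = A_i^T b_i + rho (z - u_i)
    x = [np.linalg.solve(Ai.T @ Ai + rho * np.eye(d),
                         Ai.T @ bi + rho * (z - ui))
         for Ai, bi, ui in zip(A, b, u)]
    # consensus z-update: average of the shifted local iterates
    z = np.mean([xi + ui for xi, ui in zip(x, u)], axis=0)
    # dual updates drive each x_i toward z
    u = [ui + xi - z for ui, xi in zip(u, x)]

print(z)  # converges toward the common minimizer x_true
```

Only `x_i + u_i` is communicated to form `z`, which is the structural feature that communication-efficient ADMM variants (quantized, grouped, weighted, and so on) modify.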
