Paper information - A projection-free decentralized algorithm for non-convex optimization

A projection-free decentralized algorithm for non-convex optimization

This paper considers a decentralized projection-free algorithm for non-convex optimization in high dimensions. More specifically, we propose a Decentralized Frank-Wolfe (DeFW) algorithm, which is suitable when high-dimensional optimization constraints are difficult to handle with conventional projection/proximal-based gradient descent methods. We present conditions under which the DeFW algorithm converges to a stationary point and prove that the rate of convergence is as fast as O(1/√T), where T is the iteration number. This paper provides the first convergence guarantee for Frank-Wolfe methods applied to non-convex decentralized optimization. Utilizing our theoretical findings, we formulate a novel robust matrix completion problem and apply DeFW to give an efficient decentralized solution. Numerical experiments are performed on realistic and synthetic data to support our findings.
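To make the projection-free idea concrete, the following is a minimal sketch of a generic consensus-based Frank-Wolfe iteration. It is not the authors' exact DeFW procedure, which the abstract does not spell out. The names `lmo_l1_ball` and `decentralized_fw`, the choice of an l1-ball constraint, the single averaging round per iteration, and the step size t^(-1/2) are all assumptions made for illustration. The point it shows is that each agent only solves a linear minimization over the constraint set and takes convex combinations, so no projection is ever needed.

```python
import numpy as np


def lmo_l1_ball(grad, radius):
    """Linear minimization oracle: argmin of <grad, s> over ||s||_1 <= radius.

    The minimizer is a signed vertex of the l1 ball, so no projection is needed.
    """
    i = int(np.argmax(np.abs(grad)))
    s = np.zeros_like(grad)
    s[i] = -radius * np.sign(grad[i])
    return s


def decentralized_fw(local_grads, W, x0, radius, num_iters):
    """Illustrative consensus-based Frank-Wolfe loop (assumed structure).

    local_grads: list of callables, one gradient function per agent
    W:           doubly stochastic mixing matrix with nonnegative entries, (n, n)
    x0:          initial iterates, shape (n, d), each row inside the l1 ball
    """
    x = np.array(x0, dtype=float)
    n = len(local_grads)
    for t in range(1, num_iters + 1):
        # Consensus step: each agent averages its neighbors' iterates.
        x_avg = W @ x
        # Each agent evaluates its own local gradient at the averaged iterate.
        g = np.stack([local_grads[i](x_avg[i]) for i in range(n)])
        # One round of gradient averaging (a simplification assumed here).
        g_avg = W @ g
        # Assumed diminishing step size; it stays in (0, 1].
        gamma = t ** -0.5
        # Frank-Wolfe update: convex combination keeps every row feasible.
        s = np.stack([lmo_l1_ball(g_avg[i], radius) for i in range(n)])
        x = (1.0 - gamma) * x_avg + gamma * s
    return x
```

Because both the consensus step and the Frank-Wolfe update are convex combinations of feasible points, every agent's iterate stays inside the constraint set throughout, provided the starting points are feasible and the mixing matrix has nonnegative entries.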

Eric Moulines | Anna Scaglione | Hoi-To Wai | Jean Lafond
