Paper information - Solving Non-smooth Constrained Programs with Lower Complexity than $\mathcal{O}(1/\varepsilon)$: A Primal-Dual Homotopy Smoothing Approach

We propose a new primal-dual homotopy smoothing algorithm for a linearly constrained convex program, where neither the primal nor the dual function has to be smooth or strongly convex. The best known iteration complexity for solving such a non-smooth problem is $\mathcal{O}(\varepsilon^{-1})$. In this paper, we show that, by leveraging a local error bound condition on the dual function, the proposed algorithm achieves a better primal convergence time of $\mathcal{O}\left(\varepsilon^{-2/(2+\beta)}\log_2(\varepsilon^{-1})\right)$, where $\beta\in(0,1]$ is a local error bound parameter. As an example application, we show that the distributed geometric median problem, which can be formulated as a constrained convex program, has a dual function that is non-smooth but satisfies the aforementioned local error bound condition with $\beta=1/2$, and therefore enjoys a convergence time of $\mathcal{O}\left(\varepsilon^{-4/5}\log_2(\varepsilon^{-1})\right)$. This result improves upon the $\mathcal{O}(\varepsilon^{-1})$ convergence time bound achieved by existing distributed optimization algorithms. Simulation experiments also demonstrate the performance of our proposed algorithm.
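
The $\varepsilon^{-4/5}$ rate quoted above is the general exponent $2/(2+\beta)$ evaluated at $\beta=1/2$. The short calculation below is only a worked instance of the stated bound. It also evaluates the other endpoint $\beta=1$ and shows why the bound beats $\mathcal{O}(\varepsilon^{-1})$ for every admissible $\beta$.

$$
\left.\frac{2}{2+\beta}\right|_{\beta=1/2} = \frac{2}{5/2} = \frac{4}{5},
\qquad
\left.\frac{2}{2+\beta}\right|_{\beta=1} = \frac{2}{3},
\qquad
\frac{2}{2+\beta} < 1 \quad \text{for all } \beta\in(0,1].
$$

Because the exponent is strictly below $1$ and the logarithmic factor grows more slowly than any positive power of $\varepsilon^{-1}$, the bound $\varepsilon^{-2/(2+\beta)}\log_2(\varepsilon^{-1})$ is of strictly lower order than $\varepsilon^{-1}$ as $\varepsilon\to 0$.

As a sketch of how the geometric median can be written as a linearly constrained convex program, one common consensus formulation is given below. This is an assumption for illustration, since the abstract does not state the formulation. The symbols are hypothetical: agent $i$ holds a data point $b_i\in\mathbb{R}^d$ and a local copy $x_i$ of the decision variable, and $\mathcal{E}$ is the edge set of a connected communication graph.

$$
\min_{x_1,\dots,x_n\in\mathbb{R}^d}\ \sum_{i=1}^{n}\|x_i-b_i\|_2
\quad\text{subject to}\quad x_i = x_j \ \ \text{for all } (i,j)\in\mathcal{E}.
$$

The objective is a sum of Euclidean norms, so it is convex but neither smooth nor strongly convex. The constraints are linear in the stacked variable $(x_1,\dots,x_n)$, which matches the problem class described above.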

Qing Ling | Xiaohan Wei | Hao Yu | Michael J. Neely
