Faster First-Order Primal-Dual Methods for Linear Programming using Restarts and Sharpness

First-order primal-dual methods are appealing for their low memory overhead, fast iterations, and effective parallelization. However, they are often slow at finding high-accuracy solutions, which creates a barrier to their use in traditional linear programming (LP) applications. This paper exploits the sharpness of primal-dual formulations of LP instances to achieve linear convergence using restarts in a general setting that applies to ADMM (alternating direction method of multipliers), PDHG (primal-dual hybrid gradient method), and EGM (extragradient method). In the special case of PDHG, without restarts we show a lower bound of Ω(κ log(1/ε)), while with restarts we show an upper bound of O(κ log(1/ε)), where κ is a condition number and ε is the desired accuracy. Moreover, the upper bound is optimal for a wide class of primal-dual methods, and applies to the strictly more general class of sharp primal-dual problems. We develop an adaptive restart scheme and verify that restarts significantly improve the ability of PDHG to find high-accuracy solutions to LP problems.
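To make the restart idea concrete, here is a minimal sketch of restarted PDHG for an LP in standard form (min c^T x subject to Ax = b, x ≥ 0). The function name restarted_pdhg, the step-size choice, and the restart test are illustrative assumptions, not the paper's exact method: the paper's adaptive scheme is driven by a normalized duality gap, whereas this sketch substitutes a simpler KKT-residual proxy that must decay by a factor β before the method restarts from the epoch's average iterate.

```python
import numpy as np

def restarted_pdhg(A, b, c, iters=10_000, beta=0.5):
    """Sketch of PDHG with adaptive restarts for: min c @ x  s.t.  A @ x = b, x >= 0.

    Restart rule (simplified stand-in for the paper's normalized duality gap):
    restart from the epoch's average iterate once a KKT-residual proxy at the
    average drops below `beta` times its value at the last restart.
    """
    m, n = A.shape
    # Step sizes satisfying tau * sigma * ||A||_2^2 < 1, the standard PDHG condition.
    tau = sigma = 0.9 / np.linalg.norm(A, 2)
    x, y = np.zeros(n), np.zeros(m)

    def residual(u, v):
        # Proxy for distance to optimality: primal infeasibility ||Au - b||
        # plus dual infeasibility, i.e. violation of A^T v <= c.
        return (np.linalg.norm(A @ u - b)
                + np.linalg.norm(np.maximum(A.T @ v - c, 0.0)))

    x_sum, y_sum, count = np.zeros(n), np.zeros(m), 0
    last = residual(x, y)
    for _ in range(iters):
        # Projected primal (gradient) step on L(x, y) = c@x - y@(A@x - b).
        x_next = np.maximum(x - tau * (c - A.T @ y), 0.0)
        # Dual ascent step evaluated at the extrapolated point 2*x_next - x.
        y = y + sigma * (b - A @ (2.0 * x_next - x))
        x = x_next
        x_sum += x
        y_sum += y
        count += 1
        x_avg, y_avg = x_sum / count, y_sum / count
        if residual(x_avg, y_avg) <= beta * last:   # enough progress this epoch:
            x, y = x_avg, y_avg                     # restart from the average iterate
            x_sum, y_sum, count = np.zeros(n), np.zeros(m), 0
            last = residual(x, y)
    return x, y
```

Restarting from the running average rather than the last iterate is the key design choice: on sharp problems, the ergodic O(1/k) guarantee of each epoch compounds across restarts into linear convergence.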
