Unifying Width-Reduced Methods for Quasi-Self-Concordant Optimization

We provide several algorithms for constrained optimization of a large class of convex problems, including softmax, ℓp regression, and logistic regression. Central to our approach is the notion of width reduction, a technique which has proven immensely useful in the context of maximum flow [Christiano et al., STOC'11] and, more recently, ℓp regression [Adil et al., SODA'19], in terms of improving the iteration complexity from O(m^{1/2}) to Õ(m^{1/3}), where m is the number of rows of the design matrix, and where each iteration amounts to a linear system solve. However, a considerable drawback is that these methods require both problem-specific potentials and individually tailored analyses. As our main contribution, we initiate a new direction of study by presenting the first unified approach to achieving m^{1/3}-type rates. Notably, our method goes beyond these previously considered problems to more broadly capture quasi-self-concordant losses, a class which has recently generated much interest and includes the well-studied problem of logistic regression, among others. In order to do so, we develop a unified width reduction method for carefully handling these losses based on a more general set of potentials. Additionally, we directly achieve m^{1/3}-type rates in the constrained setting without the need for any explicit acceleration schemes, thus naturally complementing recent work based on a ball-oracle approach [Carmon et al., NeurIPS'20].
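For context, the loss class in question has a standard definition in the generalized self-concordance literature (e.g., Bach's self-concordant analysis for logistic regression, cited below); the abstract does not restate it, so we record one common form here: a convex, three-times differentiable f is M-quasi-self-concordant if

\[ |\nabla^3 f(x)[h, h, h]| \le M \, \|h\|_2 \, \nabla^2 f(x)[h, h] \quad \text{for all } x, h, \]

which in one dimension reads |f'''(x)| ≤ M f''(x). For instance, the logistic loss t ↦ log(1 + e^t) satisfies this with M = 1, since f'''(t) = f''(t)(1 − 2σ(t)) and |1 − 2σ(t)| ≤ 1; classical self-concordance instead bounds |f'''| by 2(f'')^{3/2}.

To make the width-reduction mechanic concrete, the following is a minimal schematic sketch, not the paper's algorithm: the function name, the parameters rho and eps, and the specific reweighting rules are all illustrative assumptions. It shows the shape of a multiplicative-weights loop in the style of [Christiano et al., STOC'11], where each iteration is a weighted linear system solve, and rows whose "congestion" exceeds the width threshold trigger a width-reduction step instead of a progress step.

    import numpy as np

    def width_reduced_mwu(A, b, T=200, rho=10.0, eps=0.1):
        """Schematic width-reduced multiplicative-weights loop (illustrative only)."""
        m, n = A.shape
        w = np.ones(m)                       # one weight per row of the design matrix
        iterates = []
        for _ in range(T):
            # Progress oracle: weighted least squares, i.e., one linear system solve
            # of the normal equations (A^T W A) x = A^T W b.
            WA = A * w[:, None]
            x = np.linalg.solve(A.T @ WA, A.T @ (w * b))
            cong = np.abs(A @ x - b)         # per-row "congestion" of this iterate
            if cong.max() > rho:
                # Width-reduction step: boost the weights of high-congestion rows
                # so later solves avoid them; no progress is recorded.
                w[cong > rho] *= 1.0 + eps
            else:
                # Progress step: mild multiplicative reweighting, record the iterate.
                w *= np.exp(eps * cong / rho)
                iterates.append(x)
        return np.mean(iterates, axis=0) if iterates else x

The width-reduction branch is what controls the maximum congestion ("width") seen by the oracle, and trading off how often it fires against the progress steps is what improves O(m^{1/2}) iteration bounds to Õ(m^{1/3})-type rates; the paper's contribution, per the abstract, is a potential-based version of this step that applies uniformly across quasi-self-concordant losses rather than being tailored per problem.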

[1] Aaron Sidford et al. Faster energy maximization for faster maximum flow. STOC, 2019.

[2] Shang-Hua Teng et al. Electrical flows, Laplacian systems, and faster approximation of maximum flow in undirected graphs. STOC, 2011.

[3] Jochen Könemann et al. Faster and simpler algorithms for multicommodity flow and other fractional packing problems. FOCS, 1998.

[4] Yin Tat Lee et al. An homotopy method for ℓp regression provably beyond self-concordance and in input-sparsity time. STOC, 2018.

[5] Adrian Vladu et al. Improved Convergence for ℓ∞ and ℓ1 Regression via Iteratively Reweighted Least Squares. 2019.

[6] Lisa Fleischer et al. Approximating Fractional Multicommodity Flow Independent of the Number of Commodities. SIAM J. Discrete Math., 2000.

[7] Ohad Shamir et al. Oracle complexity of second-order methods for smooth convex optimization. Mathematical Programming, 2017.

[8] Aaron Sidford et al. Unit Capacity Maxflow in Almost O(m^{4/3}) Time. FOCS, 2020.

[9] Martin Jaggi et al. Global linear convergence of Newton's method without strong-convexity or Lipschitz gradients. arXiv, 2018.

[10] Richard Peng et al. Flows in almost linear time via adaptive preconditioning. STOC, 2019.

[11] Gary L. Miller et al. Runtime guarantees for regression problems. ITCS, 2013.

[12] Avi Wigderson et al. Much Faster Algorithms for Matrix Scaling. FOCS, 2017.

[13] Yurii Nesterov. Smooth minimization of non-smooth functions. Mathematical Programming, 2005.

[14] Aleksander Madry et al. Matrix Scaling and Balancing via Box Constrained Newton's Method and Interior Point Methods. FOCS, 2017.

[15] Alessandro Rudi et al. Globally Convergent Newton Methods for Ill-conditioned Generalized Self-concordant Losses. NeurIPS, 2019.

[16] Deeksha Adil et al. Faster p-norm minimizing flows, via smoothed q-norm problems. SODA, 2020.

[17] Yin Tat Lee et al. Acceleration with a Ball Optimization Oracle. NeurIPS, 2020.

[18] Brian Bullins. Highly smooth minimization of non-smooth problems. COLT, 2020.

[19] Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Springer, 2006.

[20] Quoc Tran-Dinh et al. Generalized self-concordant functions: a recipe for Newton-type methods. Mathematical Programming, 2017.

[21] Éva Tardos et al. Fast approximation algorithms for fractional packing and covering problems. FOCS, 1991.

[22] Francis R. Bach. Self-concordant analysis for logistic regression. arXiv, 2009.

[23] Renato D. C. Monteiro et al. An Accelerated Hybrid Proximal Extragradient Method for Convex Optimization and Its Implications to Second-Order Methods. SIAM J. Optim., 2013.

[24] Brian Bullins et al. Almost-linear-time Weighted ℓp-norm Solvers in Slightly Dense Graphs via Sparsification. arXiv:2102.06977, 2021.

[25] Yin Tat Lee et al. Near Optimal Methods for Minimizing Convex Functions with Lipschitz p-th Derivatives. COLT, 2019.

[26] Richard Peng et al. Iterative Refinement for ℓp-norm Regression. SODA, 2019.