Finding Second-Order Stationary Points Efficiently in Smooth Nonconvex Linearly Constrained Optimization Problems

This paper proposes two efficient algorithms for computing approximate second-order stationary points (SOSPs) of problems with generic smooth non-convex objective functions and generic linear constraints. While finding (approximate) SOSPs for this class of smooth non-convex linearly constrained problems is computationally intractable in the worst case, we show that generic problem instances in the class can be solved efficiently. Specifically, for a generic problem instance, we show that a certain strict complementarity (SC) condition holds for all Karush-Kuhn-Tucker (KKT) solutions. Based on this condition, we design an algorithm named Successive Negative-curvature grAdient Projection (SNAP), which performs either conventional gradient projection steps or negative-curvature-based projection steps to find SOSPs. SNAP is a second-order algorithm that requires Õ(max{1/ε_G^2, 1/ε_H^3}) iterations to compute an (ε_G, ε_H)-SOSP, where Õ hides the iteration complexity of the eigenvalue decomposition step. Building on SNAP, we propose a first-order algorithm, named SNAP⁺, that requires O(1/ε^{2.5}) iterations to compute an (ε, √ε)-SOSP. The per-iteration computational complexities of both algorithms are polynomial in the number of constraints and the problem dimension. To the best of our knowledge, this is the first time that first-order algorithms with polynomial per-iteration complexity and a global sublinear rate have been designed to find SOSPs for (almost all instances of) this important class of non-convex linearly constrained problems.
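To make the switching rule described above concrete, the following is a minimal, hypothetical sketch, not the paper's actual SNAP procedure: it assumes equality constraints Ax = b with full-row-rank A and a feasible starting point x0, takes projected-gradient steps while the reduced gradient is large, and otherwise computes an eigenvalue decomposition of the reduced Hessian to either certify an approximate SOSP or move along a negative-curvature direction. All names, tolerances, and step sizes are illustrative assumptions.

```python
# Illustrative sketch only (not the authors' SNAP implementation): a projected-gradient
# loop that switches to a negative-curvature step when the projected gradient is small,
# specialized to equality constraints A x = b so the projection has a closed form.
import numpy as np

def projected_descent_with_negative_curvature(f, grad, hess, A, b, x0,
                                               step=1e-2, eps_g=1e-4, eps_h=1e-3,
                                               max_iter=10000):
    # Orthonormal basis Z of the null space of A (assumes A has full row rank, m < n);
    # feasible directions from a feasible point are exactly the vectors Z @ u.
    _, _, Vt = np.linalg.svd(A)
    Z = Vt[A.shape[0]:].T
    x = x0.copy()
    for _ in range(max_iter):
        g = Z.T @ grad(x)                      # gradient reduced to the feasible subspace
        if np.linalg.norm(g) > eps_g:
            x = x - step * (Z @ g)             # "conventional" projected-gradient step
            continue
        H = Z.T @ hess(x) @ Z                  # reduced Hessian on the feasible subspace
        lam, V = np.linalg.eigh(H)             # eigenvalue decomposition (the cost hidden by Õ)
        if lam[0] >= -eps_h:                   # approximate second-order stationarity reached
            return x
        d = Z @ V[:, 0]                        # direction of most negative curvature
        d = -d if grad(x) @ d > 0 else d       # orient it so the step is a descent step
        x = x + step * d
    return x
```

The eigenvalue decomposition in the second branch is the per-iteration cost that the Õ notation in the abstract absorbs; a first-order variant in the spirit of SNAP⁺ would presumably replace it with gradient-only negative-curvature extraction, which this sketch does not attempt.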
