Finding Second-Order Stationary Points Efficiently in Smooth Nonconvex Linearly Constrained Optimization Problems

This paper proposes two efficient algorithms for computing approximate second-order stationary points (SOSPs) of problems with generic smooth non-convex objective functions and generic linear constraints. While finding (approximate) SOSPs for this class of smooth non-convex linearly constrained problems is computationally intractable in the worst case, we show that generic problem instances in the class can be solved efficiently. Specifically, for a generic problem instance, we show that a certain strict complementarity (SC) condition holds for all Karush-Kuhn-Tucker (KKT) solutions. Based on this condition, we design an algorithm named Successive Negative-curvature grAdient Projection (SNAP), which performs either conventional gradient projection steps or negative-curvature-based projection steps to find SOSPs. SNAP is a second-order algorithm that requires Õ(max{1/ε_G^2, 1/ε_H^3}) iterations to compute an (ε_G, ε_H)-SOSP, where Õ hides the iteration complexity of the eigenvalue decomposition step. Building on SNAP, we propose a first-order algorithm, named SNAP⁺, that requires O(1/ε^{2.5}) iterations to compute an (ε, √ε)-SOSP. The per-iteration computational complexities of both algorithms are polynomial in the number of constraints and the problem dimension. To the best of our knowledge, this is the first time that first-order algorithms with polynomial per-iteration complexity and a global sublinear rate have been designed to find SOSPs for (almost all instances of) this important class of non-convex linearly constrained problems.
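To make the switching rule described above concrete, the following is a minimal, hypothetical sketch, not the paper's actual SNAP procedure: it assumes equality constraints Ax = b with full-row-rank A and a feasible starting point x0, takes projected-gradient steps while the reduced gradient is large, and otherwise computes an eigenvalue decomposition of the reduced Hessian to either certify an approximate SOSP or move along a negative-curvature direction. All names, tolerances, and step sizes are illustrative assumptions.

```python
# Illustrative sketch only (not the authors' SNAP implementation): a projected-gradient
# loop that switches to a negative-curvature step when the projected gradient is small,
# specialized to equality constraints A x = b so the projection has a closed form.
import numpy as np

def projected_descent_with_negative_curvature(f, grad, hess, A, b, x0,
                                               step=1e-2, eps_g=1e-4, eps_h=1e-3,
                                               max_iter=10000):
    # Orthonormal basis Z of the null space of A (assumes A has full row rank, m < n);
    # feasible directions from a feasible point are exactly the vectors Z @ u.
    _, _, Vt = np.linalg.svd(A)
    Z = Vt[A.shape[0]:].T
    x = x0.copy()
    for _ in range(max_iter):
        g = Z.T @ grad(x)                      # gradient reduced to the feasible subspace
        if np.linalg.norm(g) > eps_g:
            x = x - step * (Z @ g)             # "conventional" projected-gradient step
            continue
        H = Z.T @ hess(x) @ Z                  # reduced Hessian on the feasible subspace
        lam, V = np.linalg.eigh(H)             # eigenvalue decomposition (the cost hidden by Õ)
        if lam[0] >= -eps_h:                   # approximate second-order stationarity reached
            return x
        d = Z @ V[:, 0]                        # direction of most negative curvature
        d = -d if grad(x) @ d > 0 else d       # orient it so the step is a descent step
        x = x + step * d
    return x
```

The eigenvalue decomposition in the second branch is the per-iteration cost that the Õ notation in the abstract absorbs; a first-order variant in the spirit of SNAP⁺ would presumably replace it with gradient-only negative-curvature extraction, which this sketch does not attempt.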
