Paper Information - Ghost Penalties in Nonconvex Constrained Optimization: Diminishing Stepsizes and Iteration Complexity


We consider, for the first time, general diminishing stepsize methods for nonconvex, constrained optimization problems. We show that, by using directions obtained in an SQP-like fashion, convergence to generalized stationary points can be proved. In order to do so, we make use of classical penalty functions in an unconventional way. In particular, penalty functions enter only in the theoretical analysis of convergence, while the algorithm itself is penalty-free. We then consider the iteration complexity of this method and of some variants where the stepsize is either kept constant or decreased according to very simple rules. We establish convergence to $\delta$-approximate stationary points in at most $O(\delta^{-2})$, $O(\delta^{-3})$, or $O(\delta^{-4})$ iterations, according to the assumptions made on the problem. These complexity results nicely complement the very few existing results in the field.
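To make the idea of a diminishing stepsize scheme concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a much simpler setting than the abstract covers: a toy nonconvex objective with simple box constraints, so that the direction subproblem (a strongly convex quadratic model, in the spirit of SQP) has a closed-form solution. The objective, the names `grad_f`, `direction`, and `run`, the constant `tau`, and the rule $\gamma_k = (k+1)^{-1/2}$ are all illustrative assumptions.

```python
# Toy sketch (assumed setting): minimize f(x) = sum(0.25*x_i**4 - 0.5*x_i**2)
# subject to lo <= x <= hi. This is NOT the method of the paper, which
# handles general nonconvex constraints.

def grad_f(x):
    # Gradient of the toy nonconvex objective, coordinate by coordinate.
    return [xi**3 - xi for xi in x]

def direction(x, lo, hi, tau):
    # Solve: min_d grad_f(x).d + (tau/2)*||d||^2  s.t.  lo <= x + d <= hi.
    # The problem is separable, so the solution is a clipped gradient step.
    g = grad_f(x)
    return [min(max(xi - gi / tau, l), h) - xi
            for xi, gi, l, h in zip(x, g, lo, hi)]

def run(x0, lo, hi, tau=12.0, iters=20000):
    x = list(x0)
    for k in range(iters):
        # Diminishing stepsize: gamma_k -> 0 while sum(gamma_k) diverges.
        gamma = (k + 1) ** -0.5
        d = direction(x, lo, hi, tau)
        # gamma <= 1 and the box is convex, so the iterate stays feasible.
        x = [xi + gamma * di for xi, di in zip(x, d)]
    # The size of the direction serves as a stationarity measure here.
    residual = max(abs(di) for di in direction(x, lo, hi, tau))
    return x, residual

lo, hi = [-2.0, -0.5], [2.0, 2.0]
x_final, residual = run([1.5, -0.3], lo, hi)
```

In this toy run the first coordinate settles at an interior stationary point and the second at its lower bound; no line search or penalty parameter appears in the iteration, which is the feature the abstract emphasizes.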
