Primal-Dual Stochastic Gradient Method for Convex Programs with Many Functional Constraints

The stochastic gradient (SG) method has been widely applied to solve optimization problems whose objective is stochastic or an average of many functions. Most existing work on SG assumes that the underlying problem is unconstrained or has an easy-to-project constraint set. In this paper, we consider problems with a stochastic objective and many functional constraints. For such problems, it can be extremely expensive to project a point onto the feasible set, or even to compute the subgradients and/or function values of all constraint functions. To solve these problems, we propose a novel SG method based on the augmented Lagrangian function. In every iteration, it queries a stochastic subgradient of the objective, the subgradient and function value of one randomly sampled constraint function, and the function value of another sampled constraint function; hence the per-iteration complexity is low. We establish convergence rates for convex and strongly convex problems: the method achieves the optimal $O(1/\sqrt{k})$ rate in the convex case and a nearly optimal $O\big((\log k)/k\big)$ rate in the strongly convex case. Numerical experiments on quadratically constrained quadratic programming demonstrate its efficiency.
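
To make the per-iteration template concrete, below is a minimal Python sketch of a primal-dual SG step on the classical augmented Lagrangian of a toy convex QCQP: the primal step uses the subgradient and value of one sampled constraint, and the dual step uses only the value of another sampled constraint. The step sizes, the fixed penalty `beta`, the factor `m` used to unbias the sampled terms, and all problem data are illustrative assumptions, not the paper's exact update rules.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy convex QCQP: min 0.5 x'Q0 x + c0'x  s.t.  0.5 x'Qj x + cj'x - bj <= 0, j = 1..m
n, m = 20, 500

def rand_psd(n):
    A = rng.standard_normal((n, n))
    return A @ A.T / n + 0.1 * np.eye(n)

Q0, c0 = rand_psd(n), rng.standard_normal(n)
Qs = [rand_psd(n) for _ in range(m)]
cs = rng.standard_normal((m, n))
bs = np.full(m, 1.0)          # keeps x = 0 strictly feasible

def g(j, x):                  # value of the j-th constraint function
    return 0.5 * x @ Qs[j] @ x + cs[j] @ x - bs[j]

def g_grad(j, x):             # gradient of the j-th constraint function
    return Qs[j] @ x + cs[j]

x = np.zeros(n)
z = np.zeros(m)               # dual variables (multipliers), kept nonnegative
beta = 1.0                    # augmented-Lagrangian penalty (assumed fixed here)
K = 20000
x_avg = np.zeros(n)

for k in range(1, K + 1):
    alpha = 1.0 / np.sqrt(k)  # primal step size, matching the O(1/sqrt(k)) convex rate
    rho = 1.0 / np.sqrt(k)    # dual step size (illustrative choice)

    # Stochastic subgradient of the objective (exact gradient + noise, to mimic a
    # stochastic first-order oracle).
    grad_f = Q0 @ x + c0 + 0.05 * rng.standard_normal(n)

    # Sample ONE constraint j: its value and subgradient feed the primal step.
    j = rng.integers(m)
    slope = max(0.0, z[j] + beta * g(j, x))     # derivative of the AL penalty term
    grad_x = grad_f + m * slope * g_grad(j, x)  # factor m unbiases the sampled sum

    x = x - alpha * grad_x

    # Sample ANOTHER constraint l: only its function value is needed for the dual step.
    l = rng.integers(m)
    z[l] = max(0.0, z[l] + rho * m * g(l, x))   # projected stochastic ascent on z_l

    x_avg += (x - x_avg) / k                    # ergodic average of the iterates

viol = max(max(g(j, x_avg), 0.0) for j in range(m))
print(f"objective: {0.5 * x_avg @ Q0 @ x_avg + c0 @ x_avg:.4f}, max violation: {viol:.2e}")
```

Note the per-iteration cost: one stochastic objective subgradient, one constraint subgradient-plus-value, and one additional constraint value, independent of the total number of constraints $m$.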
