An iterative regularized mirror descent method for ill-posed nondifferentiable stochastic optimization.

A wide range of applications arising in machine learning and signal processing can be cast as convex optimization problems. These problems are often ill-posed, i.e., the optimal solution lacks a desired property such as uniqueness or sparsity. To address ill-posedness, the literature considers a bilevel optimization problem in which the goal is to find, among the optimal solutions of the inner-level problem, one that minimizes a secondary metric, i.e., the outer-level objective function. For the resulting bilevel model, the convergence analysis of most existing methods is limited to the case where both the inner- and outer-level objectives are differentiable deterministic functions. These assumptions may not hold in big data applications, yet, to the best of our knowledge, no solution method equipped with complexity analysis addresses the presence of uncertainty and nondifferentiability at both levels in this class of problems. Motivated by this gap, we develop a first-order method called Iterative Regularized Stochastic Mirror Descent (IR-SMD). We establish the global convergence of the iterate generated by the algorithm to the optimal solution of the bilevel problem in both an almost sure and a mean sense. We derive a convergence rate of ${\cal O}\left(1/N^{0.5-\delta}\right)$ for the inner-level problem, where $\delta>0$ is an arbitrarily small scalar. Numerical experiments on two classes of bilevel problems, including a large-scale binary text classification application, are presented.
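The idea of iterative regularization described above can be illustrated with a minimal sketch. This is not the paper's algorithm statement: the toy problem, the function name `ir_smd_sketch`, the stepsize and regularization sequences, and the Euclidean choice of mirror map (which reduces mirror descent to a subgradient step) are all assumptions made for illustration. The inner objective is the nonsmooth stochastic function $E[|x_1 + x_2 - 1 + \xi|]$, whose minimizers form the line $x_1 + x_2 = 1$; the outer objective $\frac{1}{2}\|x\|^2$ is kept smooth and deterministic for simplicity and selects the minimum-norm point $(0.5, 0.5)$ on that line.

```python
import math
import random


def ir_smd_sketch(num_iters=20000, seed=0):
    """Toy iteratively regularized stochastic subgradient scheme.

    Assumed setup (illustrative only):
      inner: minimize E|x1 + x2 - 1 + xi|, xi ~ N(0, 0.1^2)
      outer: minimize 0.5 * ||x||^2 over the inner solution set
    """
    rng = random.Random(seed)
    x = [3.0, 0.0]  # arbitrary starting point
    for k in range(num_iters):
        # Assumed sequences: both vanish, and their product is not summable.
        gamma = 0.5 / math.sqrt(k + 1)   # stepsize
        eta = 1.0 / (k + 1) ** 0.25      # regularization weight
        # Sampled subgradient of the inner objective is sign(residual) * (1, 1).
        noise = rng.gauss(0.0, 0.1)
        residual = x[0] + x[1] - 1.0 + noise
        sign = 1.0 if residual > 0 else (-1.0 if residual < 0 else 0.0)
        # Gradient of the outer objective at x is x itself.
        # Step along inner subgradient plus eta times outer gradient.
        x = [xi - gamma * (sign + eta * xi) for xi in x]
    return x


if __name__ == "__main__":
    print(ir_smd_sketch())
```

In this sketch the regularization weight decays more slowly than the stepsize, so the outer objective keeps steering the iterate within the inner solution set while its influence on inner optimality fades.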
