On perturbed steepest descent methods with inexact line search for bilevel convex optimization

We apply a general framework for convex constrained optimization, introduced in an earlier work, to obtain algorithms for problems whose constraint set is the set of minimizers of a given function. The algorithms also allow the objective function to be decomposed into a sum of convex functions that can be treated separately. We prove that the general algorithm converges to the optimum of the objective function over the set of minimizers of a previously chosen convex function with Lipschitz-continuous gradient. When orthogonal projections onto the convex constraints are used, we recover a Cimmino-like algorithm that converges to the optimum over the set of weighted least squares solutions. Finally, we present an application of our approach to compressed sensing and inverse problems.
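To make the phrase "Cimmino-like algorithm" concrete, the following is a minimal sketch of a classical simultaneous-projections iteration for a linear system, not the algorithm analyzed in the paper. It covers only the inner level (reaching a weighted least squares solution) and omits the perturbation by the outer objective. The function name `cimmino`, the parameters `weights`, `relax`, and `iters`, and the small test system are all assumptions introduced for illustration.

```python
def cimmino(rows, rhs, weights, relax=1.0, iters=200):
    """Illustrative Cimmino-type iteration for the system rows * x = rhs.

    Each sweep moves x by a weighted average of the displacements toward
    the hyperplanes a_i . x = b_i. Assumes the weights are positive and sum
    to 1, that 0 < relax < 2, and that no row is the zero vector.
    """
    n = len(rows[0])
    x = [0.0] * n
    norms_sq = [sum(a * a for a in row) for row in rows]
    for _ in range(iters):
        step = [0.0] * n
        for row, b, w, nsq in zip(rows, rhs, weights, norms_sq):
            # Signed residual of x with respect to the i-th hyperplane.
            residual = b - sum(a * xi for a, xi in zip(row, x))
            coeff = w * residual / nsq
            for j in range(n):
                step[j] += coeff * row[j]
        # Relaxed simultaneous update using all rows at once.
        x = [xi + relax * sj for xi, sj in zip(x, step)]
    return x


# Hypothetical inconsistent system: x1 = 1, x2 = 1, x1 + x2 = 3.
rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rhs = [1.0, 1.0, 3.0]
weights = [1.0 / 3.0] * 3
x = cimmino(rows, rhs, weights)
print(x)
```

For this small system no exact solution exists, and the iterates approach (1.25, 1.25), the minimizer of the weighted sum of squared residuals in which each residual is scaled by the norm of its row. A bilevel method of the kind described in the abstract would additionally steer the iterates toward the minimizer of an outer objective within that solution set.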
