On the steplength selection in gradient methods for unconstrained optimization

Abstract The seminal paper by Barzilai and Borwein (1988) has given rise to an extensive body of research, leading to the development of effective gradient methods. Several steplength rules were first designed for unconstrained quadratic problems and later extended to general nonlinear optimization problems. These rules share the common idea of attempting to capture, in an inexpensive way, some second-order information about the problem. However, the convergence theory of the gradient methods based on these rules does not explain their effectiveness, and a full understanding of their practical behaviour is still missing. In this work we investigate the relationships between the steplengths of a variety of gradient methods and the spectrum of the Hessian of the objective function, providing insight into the computational effectiveness of these methods for both quadratic and general unconstrained optimization problems. Our study also identifies basic principles for designing effective gradient methods.
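The two-point steplength of Barzilai and Borwein [22] can be sketched as below; the function name, the first-step heuristic, and the test problem are illustrative assumptions, not taken from the paper. On a quadratic f(x) = (1/2) x^T A x - b^T x one has y_k = A s_k, so the BB1 steplength s_k^T s_k / s_k^T y_k is the reciprocal of a Rayleigh quotient of A, i.e. it lies between the reciprocals of the extreme Hessian eigenvalues — the kind of inexpensive second-order information the abstract refers to.

```python
import numpy as np

def bb_gradient_quadratic(A, b, x0, iters=500, tol=1e-10):
    """Gradient method with the Barzilai-Borwein (BB1) steplength on
    f(x) = 0.5 x^T A x - b^T x, with A symmetric positive definite.
    Returns the final iterate and the list of BB steplengths used.
    (Function name and first-step heuristic are illustrative choices.)"""
    x = np.asarray(x0, dtype=float)
    g = A @ x - b                      # gradient of the quadratic
    alpha = 1.0 / np.linalg.norm(g)    # conservative first step (heuristic)
    steps = []
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        x_new = x - alpha * g
        g_new = A @ x_new - b
        s, y = x_new - x, g_new - g    # s_k and y_k; y_k = A s_k here
        alpha = (s @ s) / (s @ y)      # BB1: inverse Rayleigh quotient of A
        steps.append(alpha)
        x, g = x_new, g_new
    return x, steps
```

For a symmetric positive definite A with eigenvalues in [lmin, lmax], every BB1 steplength produced above lies in [1/lmax, 1/lmin], which is why these steplengths can be read as approximate reciprocals of Hessian eigenvalues.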

[1] R. Fletcher et al., A limited memory steepest descent method, Mathematical Programming, 2011.

[2] Y.-X. Yuan, A new stepsize for the steepest descent method, 2006.

[3] G. Yu et al., On nonmonotone Chambolle gradient projection algorithms for total variation image restoration, Journal of Mathematical Imaging and Vision, 2009.

[4] L. Pronzato et al., Gradient algorithms for quadratic optimization with fast convergence rates, Comput. Optim. Appl., 2011.

[5] A. S. Nemirovsky, D. B. Yudin, Problem Complexity and Method Efficiency in Optimization, 1983.

[6] L. Pronzato et al., An asymptotically optimal gradient algorithm for quadratic optimization with low computational cost, Optimization Letters, 2012.

[7] Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, Applied Optimization, 2004.

[8] J. Nocedal et al., On the behavior of the gradient norm in the steepest descent method, Comput. Optim. Appl., 2002.

[9] V. A. Marčenko, L. A. Pastur, Distribution of eigenvalues for some sets of random matrices, 1967.

[10] Y.-X. Yuan, Alternate minimization gradient method, 2003.

[11] Y.-X. Yuan, Analysis of monotone gradient methods, 2005.

[12] M. Raydan et al., Relaxed steepest descent and Cauchy-Barzilai-Borwein method, Comput. Optim. Appl., 2002.

[13] W. W. Hager et al., A nonmonotone line search technique and its application to unconstrained optimization, SIAM J. Optim., 2004.

[14] L. Zanni et al., New adaptive stepsize selections in gradient methods, 2008.

[15] J. M. Martínez et al., Gradient method with retards and generalizations, 1998.

[16] J. M. Martínez et al., Nonmonotone spectral projected gradient methods on convex sets, SIAM J. Optim., 1999.

[17] W. Hager et al., The cyclic Barzilai-Borwein method for unconstrained optimization, 2006.

[18] L. Liao et al., R-linear convergence of the Barzilai and Borwein gradient method, 2002.

[19] M. Raydan, The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem, SIAM J. Optim., 1997.

[20] R. Fletcher et al., A rapidly convergent descent method for minimization, Comput. J., 1963.

[21] C. C. Gonzaga et al., On the steepest descent algorithm for quadratic functions, Comput. Optim. Appl., 2016.

[22] J. Barzilai, J. M. Borwein, Two-point step size gradient methods, 1988.

[23] R. Fletcher et al., On the asymptotic behaviour of some new gradient methods, Math. Program., 2005.

[24] L. Pronzato et al., Estimation of spectral bounds in gradient algorithms, 2013.

[25] G. H. Golub, C. F. Van Loan, Matrix Computations, 1983.

[26] A. Iusem et al., Full convergence of the steepest descent method with inexact line searches, 1995.

[27] M. A. T. Figueiredo et al., Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems, IEEE Journal of Selected Topics in Signal Processing, 2007.

[28] D. di Serafino et al., On the regularizing behavior of the SDA and SDC gradient methods in the solution of linear ill-posed problems, J. Comput. Appl. Math., 2016.

[29] E. Birgin et al., Estimation of the optical constants and the thickness of thin films using unconstrained optimization, 2006.

[30] L. Zanni et al., A scaled gradient projection method for constrained image deblurring, 2008.

[31] L. Grippo et al., A nonmonotone line search technique for Newton's method, 1986.

[32] W. W. Hager et al., An efficient gradient method using the Yuan steplength, Comput. Optim. Appl., 2014.

[33] L. Zanni et al., Gradient projection methods for quadratic programs and applications in training support vector machines, Optim. Methods Softw., 2005.

[34] Y. Dai, On the nonmonotone line search, 2002.

[35] Y. Dai, Alternate step gradient method, 2003.

[36] L. Zanni et al., Accelerating gradient projection methods for ℓ1-constrained signal recovery by steplength selection rules, 2009.

[37] R. Fletcher et al., On the Barzilai-Borwein method, 2005.

[38] H. Akaike, On a successive transformation of probability distribution and its application to the analysis of the optimum gradient method, 1959.

[39] G. Toraldo et al., On spectral properties of steepest descent methods, 2013.

[40] C. C. Gonzaga et al., On the worst case performance of the steepest descent algorithm for quadratic functions, Math. Program., 2016.

[41] H. Zhang et al., Adaptive two-point stepsize gradient algorithm, Numerical Algorithms, 2001.

[42] L. Zanni et al., A new steplength selection for scaled gradient methods with application to image deblurring, J. Sci. Comput., 2014.

[43] D. di Serafino et al., On the application of the spectral projected gradient method in image segmentation, Journal of Mathematical Imaging and Vision, 2015.

[44] W. Guo et al., R-linear convergence of limited memory steepest descent, 2016.

[45] G. Zanghirati et al., Towards real-time image deconvolution: application to confocal and STED microscopy, Scientific Reports, 2013.

[46] P. Toint, Some numerical results using a sparse matrix updating formula in unconstrained optimization, 1978.

[47] R. Fletcher et al., New algorithms for singly linearly constrained quadratic programs subject to lower and upper bounds, Math. Program., 2006.

[48] B. Zhou et al., Gradient methods with adaptive step-sizes, Comput. Optim. Appl., 2006.

[49] V. Ruggiero et al., A note on spectral properties of some gradient methods, 2016.

[51] S. J. Wright et al., Duality-based algorithms for total-variation-regularized image restoration, Comput. Optim. Appl., 2010.

[52] J. M. Martínez et al., Spectral projected gradient methods: review and perspectives, 2014.