The Limited Memory Conjugate Gradient Method

In theory, the successive gradients generated by the conjugate gradient method applied to a quadratic should be orthogonal. However, for some ill-conditioned problems, orthogonality is quickly lost due to rounding errors, and convergence is much slower than expected. A limited memory version of the nonlinear conjugate gradient method is developed. The memory is used both to detect the loss of orthogonality and to restore it. An implementation of the algorithm is presented based on the CG_DESCENT nonlinear conjugate gradient method. Limited memory CG_DESCENT (L-CG_DESCENT) possesses a global convergence property similar to that of the memoryless algorithm but has much better practical performance. Numerical comparisons to the limited memory BFGS method (L-BFGS) are given using the CUTEr test problems.
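The loss of orthogonality described above is easy to observe numerically. The sketch below (not the paper's algorithm) runs textbook linear conjugate gradient on a Hilbert matrix, a standard example of an ill-conditioned quadratic, and measures how far nonadjacent residuals (negative gradients) drift from the orthogonality that exact arithmetic guarantees. The matrix size and the Hilbert test problem are illustrative choices, not taken from the paper.

```python
import math

def hilbert(n):
    # Hilbert matrix: a classic severely ill-conditioned SPD matrix.
    return [[1.0 / (i + j + 1) for j in range(n)] for i in range(n)]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cg_residuals(A, b, iters):
    # Standard linear CG for A x = b; returns the residual history.
    # For f(x) = 0.5 x'Ax - b'x, the residual r_k = b - A x_k = -grad f(x_k),
    # so pairwise orthogonality of the r_k is exactly the property the
    # abstract says is lost in floating-point arithmetic.
    n = len(b)
    x = [0.0] * n
    r = b[:]
    p = r[:]
    residuals = [r[:]]
    for _ in range(iters):
        Ap = matvec(A, p)
        alpha = dot(r, r) / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r_new = [ri - alpha * api for ri, api in zip(r, Ap)]
        residuals.append(r_new[:])
        if dot(r_new, r_new) == 0.0:
            break
        beta = dot(r_new, r_new) / dot(r, r)
        p = [ri + beta * pi for ri, pi in zip(r_new, p)]
        r = r_new
    return residuals

def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return abs(dot(u, v)) / (nu * nv) if nu > 0 and nv > 0 else 0.0

n = 12                      # cond(Hilbert_12) ~ 1e16
A = hilbert(n)
b = [1.0] * n
rs = cg_residuals(A, b, n)

# In exact arithmetic cosine(r_i, r_j) = 0 for all i != j; in double
# precision on this problem it grows far above roundoff level.
worst = max(cosine(rs[i], rs[j])
            for i in range(len(rs)) for j in range(i + 2, len(rs)))
print(f"worst normalized |r_i . r_j| over nonadjacent pairs: {worst:.2e}")
```

On a well-conditioned quadratic the printed value stays near machine precision; on the Hilbert matrix it is many orders of magnitude larger, which is the phenomenon the limited memory mechanism is designed to detect and correct.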
