Paper information - Gradient convergence in gradient methods

Gradient convergence in gradient methods

For the classical gradient method $x_{t+1} = x_t - \gamma_t \nabla f(x_t)$ and several deterministic and stochastic variants, we discuss the issue of convergence of the gradient sequence $\nabla f(x_t)$ and the attendant issue of stationarity of limit points of $x_t$. We assume that $\nabla f$ is Lipschitz continuous, and that the stepsize $\gamma_t$ diminishes to 0 and satisfies standard stochastic approximation conditions. We show that either $f(x_t) \to -\infty$ or else $f(x_t)$ converges to a finite value and $\nabla f(x_t) \to 0$ (with probability 1 in the stochastic case). Existing results assume various boundedness conditions such as boundedness from below of $f$, or boundedness of $\nabla f(x_t)$, or boundedness of $x_t$.
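The iteration in the abstract can be illustrated with a minimal sketch. Everything specific below is an assumption for illustration and is not taken from the paper: the test function $f(x) = x^2/2$, the stepsize $\gamma_t = 0.5/(t+1)$ (one common choice with $\sum_t \gamma_t = \infty$ and $\sum_t \gamma_t^2 < \infty$), the additive Gaussian noise model, and the names `gradient_method` and `grad_f`.

```python
import random


def gradient_method(grad, x0, steps, noise_std=0.0, seed=0):
    """Run x_{t+1} = x_t - gamma_t * (grad(x_t) + w_t).

    gamma_t = 0.5 / (t + 1) is an assumed diminishing stepsize whose sum
    diverges while the sum of its squares is finite. w_t is zero-mean
    Gaussian noise (assumed noise model); noise_std=0 gives the
    deterministic method.
    """
    rng = random.Random(seed)
    x = x0
    for t in range(steps):
        gamma = 0.5 / (t + 1)
        w = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        x = x - gamma * (grad(x) + w)
    return x


def grad_f(x):
    # Hypothetical test function f(x) = 0.5 * x**2, whose gradient x
    # is Lipschitz continuous with constant 1.
    return x


# Deterministic variant: the gradient at the final iterate should be small.
x_det = gradient_method(grad_f, x0=5.0, steps=10_000)

# Stochastic variant: noisy gradients, same stepsize rule.
x_sto = gradient_method(grad_f, x0=5.0, steps=20_000, noise_std=1.0)

print("deterministic |grad f(x_t)|:", abs(grad_f(x_det)))
print("stochastic    |grad f(x_t)|:", abs(grad_f(x_sto)))
```

Here $f$ is bounded below, so the sketch only exercises the second alternative in the abstract, where $f(x_t)$ converges and $\nabla f(x_t) \to 0$. It does not illustrate the case $f(x_t) \to -\infty$.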

D. Bertsekas

[1] V. Fabian. Stochastic approximation methods , 1960 .

[2] Lennart Ljung,et al. Analysis of recursive stochastic algorithms , 1977 .

[3] Harold J. Kushner,et al. Stochastic approximation methods for constrained and unconstrained systems , 1978 .

[4] John N. Tsitsiklis,et al. Parallel and distributed computation , 1989 .

[5] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.

[6] Zhi-Quan Luo,et al. On the Convergence of the LMS Algorithm with Adaptive Learning Rate for Linear Feedforward Networks , 1991, Neural Computation.

[7] Luo Zhi-quan,et al. Analysis of an approximate gradient projection method with applications to the backpropagation algorithm , 1994 .

[8] Alexei A. Gaivoronski,et al. Convergence properties of backpropagation for neural nets via theory of stochastic gradient methods. Part 1 , 1994 .

[9] Luigi Grippo,et al. A class of unconstrained minimization methods for neural network training , 1994 .

[10] D. Bertsekas,et al. A hybrid incremental gradient method for least squares problems , 1994 .

[11] O. Mangasarian,et al. Serial and parallel backpropagation convergence via nonmonotone perturbed minimization , 1994 .

[12] Dimitri P. Bertsekas,et al. Nonlinear Programming , 1997 .

[13] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.

[14] O. Nelles,et al. An Introduction to Optimization , 1996, IEEE Antennas and Propagation Magazine.

[15] B. Delyon. General results on the convergence of stochastic algorithms , 1996, IEEE Trans. Autom. Control.

[16] V. Borkar. Asynchronous Stochastic Approximations , 1998 .