Convergence Rate Analysis of a Stochastic Trust Region Method for Nonconvex Optimization

We introduce a variant of the traditional trust region method aimed at stochastic optimization. While traditional trust region methods rely on exact computations of the gradient and values of the objective function, our method assumes that these quantities are available only up to some dynamically adjusted accuracy. Moreover, this accuracy is assumed to hold only with some sufficiently large, but fixed, probability, without any additional restrictions on the variance of the errors. We show that our assumptions apply to the standard stochastic setting assumed in machine learning problems, but also include more general settings. We then bound the expected number of iterations the stochastic algorithm requires to reach accuracy $\|\nabla f(x)\|\leq \epsilon$, for any $\epsilon>0$. The resulting bound is $O(\epsilon^{-2})$, under the assumption of a sufficiently accurate stochastic gradient.
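
To make the setting concrete, the following is a minimal sketch of a generic first-order trust region loop driven by inexact gradient and function estimates. It is an illustration under stated assumptions, not the algorithm analyzed in the paper: the function name `stochastic_trust_region_sketch`, the linear model, the parameters `eta`, `gamma`, and `noise_scale`, and the choice to tie the estimate errors to the trust region radius (order `delta` for the gradient, order `delta**2` for function values) are all hypothetical choices made here for illustration.

```python
import math
import random


def stochastic_trust_region_sketch(f, grad, x0, delta0=1.0, eta=0.1,
                                   gamma=2.0, noise_scale=0.05,
                                   max_iter=200, seed=0):
    """Illustrative trust region loop with noisy oracles (assumed setup)."""
    rng = random.Random(seed)
    x, delta = list(x0), delta0
    for _ in range(max_iter):
        # Inexact gradient: the error is assumed to shrink with the radius.
        g = [gi + noise_scale * delta * rng.uniform(-1.0, 1.0)
             for gi in grad(x)]
        g_norm = math.sqrt(sum(gi * gi for gi in g))
        if g_norm == 0.0:
            delta /= gamma
            continue
        # Minimizer of the linear model g^T s over the ball ||s|| <= delta.
        s = [-delta * gi / g_norm for gi in g]
        x_trial = [xi + si for xi, si in zip(x, s)]
        # Inexact function values: error assumed to be of order delta^2.
        f_old = f(x) + noise_scale * delta ** 2 * rng.uniform(-1.0, 1.0)
        f_new = f(x_trial) + noise_scale * delta ** 2 * rng.uniform(-1.0, 1.0)
        # Decrease predicted by the linear model is -g^T s = delta * ||g||.
        rho = (f_old - f_new) / (delta * g_norm)
        if rho >= eta:
            # Successful step: accept it and enlarge the radius.
            x, delta = x_trial, delta * gamma
        else:
            # Unsuccessful step: keep x and shrink the radius.
            delta /= gamma
    return x, delta
```

As a usage example, calling the function with `f = lambda x: sum(v * v for v in x)`, `grad = lambda x: [2 * v for v in x]`, and `x0 = [3.0, -2.0]` drives the iterate toward the origin. Because the estimate errors in this sketch scale with the radius, the radius itself acts as the dynamically adjusted accuracy level.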