A unified modular analysis of online and stochastic optimization: adaptivity, optimism, non-convexity

We present a simple unified analysis of adaptive Mirror Descent (MD) and Follow-the-Regularized-Leader (FTRL) algorithms for online and stochastic optimization in (possibly infinite-dimensional) Hilbert spaces. The analysis is modular in the sense that it completely decouples the effect of possible assumptions on the loss functions (such as smoothness, strong convexity, and non-convexity) and on the optimization regularizers (such as strong convexity, non-smooth penalties in composite-objective learning, and non-monotone step-size sequences). We demonstrate the power of this decoupling by obtaining generalized algorithms and improved regret bounds for the so-called "adaptive optimistic online learning" setting. In addition, we simplify and extend a large body of previous work, including various AdaGrad formulations as well as composite-objective and implicit-update algorithms. In all cases, the results follow as simple corollaries within a few lines of algebra. Finally, the decomposition enables us to obtain preliminary global guarantees for limited classes of non-convex problems.
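To fix ideas, the adaptive updates covered by this analysis take the following standard forms (a sketch in common notation, not necessarily the notation or exact generality used in the body of the paper); here $g_t$ denotes a (sub)gradient of the loss $\ell_t$ at the iterate $x_t$, the $r_s$ are possibly data-dependent regularizers, and $\mathcal{B}_{R}$ is the Bregman divergence induced by $R$:
\[
\text{FTRL:}\quad x_{t+1} \in \operatorname*{arg\,min}_{x \in \mathcal{X}} \; \sum_{s=1}^{t} \langle g_s, x\rangle + \sum_{s=0}^{t} r_s(x),
\qquad
\text{MD:}\quad x_{t+1} \in \operatorname*{arg\,min}_{x \in \mathcal{X}} \; \langle g_t, x\rangle + r_t(x) + \mathcal{B}_{R_t}(x, x_t),
\]
where $R_t = \sum_{s=0}^{t-1} r_s$. The modular analysis controls the regret of both updates through a common decomposition, with each assumption on the losses $\ell_t$ or the regularizers $r_t$ entering the bound as a separate term.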