Online Learning and Online Convex Optimization

Online learning is a well established learning paradigm which has both theoretical and practical appeals. The goal of online learning is to make a sequence of accurate predictions given knowledge of the correct answer to previous prediction tasks and possibly additional available information. Online learning has been studied in several research fields including game theory, information theory, and machine learning. It also became of great interest to practitioners due the recent emergence of large scale applications such as online advertisement placement and online web ranking. In this survey we provide a modern overview of online learning. Our goal is to give the reader a sense of some of the interesting ideas and in particular to underscore the centrality of convexity in deriving efficient online learning algorithms. We do not mean to be comprehensive but rather to give a high-level, rigorous yet easy to follow, survey.

[1]  I. J. Schoenberg,et al.  The Relaxation Method for Linear Inequalities , 1954, Canadian Journal of Mathematics.

[2]  F ROSENBLATT,et al.  The perceptron: a probabilistic model for information storage and organization in the brain. , 1958, Psychological review.

[3]  Thomas M. Cover,et al.  Behavior of sequential predictors of binary sequences , 1965 .

[4]  Marvin Minsky,et al.  Perceptrons: An Introduction to Computational Geometry , 1969 .

[5]  N. Littlestone Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm , 1987, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).

[6]  Nick Littlestone,et al.  From on-line to batch learning , 1989, COLT '89.

[7]  Vladimir Vovk,et al.  Aggregating strategies , 1990, COLT '90.

[8]  N. Littlestone Mistake bounds and logarithmic linear-threshold learning algorithms , 1990 .

[9]  Manfred K. Warmuth,et al.  The Weighted Majority Algorithm , 1994, Inf. Comput..

[10]  Manfred K. Warmuth,et al.  Additive versus exponentiated gradient updates for linear prediction , 1995, STOC '95.

[11]  Manfred K. Warmuth,et al.  On Weak Learning , 1995, J. Comput. Syst. Sci..

[12]  Avrim Blum,et al.  On-line Algorithms in Machine Learning , 1996, Online Algorithms.

[13]  Dale Schuurmans,et al.  General Convergence Results for Linear Discriminant Updates , 1997, COLT '97.

[14]  Manfred K. Warmuth,et al.  Exponentiated Gradient Versus Gradient Descent for Linear Predictors , 1997, Inf. Comput..

[15]  Yoav Freund,et al.  Large Margin Classification Using the Perceptron Algorithm , 1998, COLT' 98.

[16]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[17]  Claudio Gentile,et al.  The Robustness of the p-Norm Algorithms , 1999, COLT '99.

[18]  Geoffrey J. Gordon Regret bounds for prediction problems , 1999, COLT '99.

[19]  Adrian S. Lewis,et al.  Convex Analysis And Nonlinear Optimization , 2000 .

[20]  Nello Cristianini,et al.  An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[21]  Peter Auer,et al.  The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..

[22]  C. Zălinescu Convex analysis in general vector spaces , 2002 .

[23]  Dustin Boswell,et al.  Introduction to Support Vector Machines , 2002 .

[24]  James C. Spall,et al.  Introduction to stochastic search and optimization - estimation, simulation, and control , 2003, Wiley-Interscience series in discrete mathematics and optimization.

[25]  Anthony Widjaja,et al.  Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2003, IEEE Transactions on Neural Networks.

[26]  Santosh S. Vempala,et al.  Efficient algorithms for online decision problems , 2005, J. Comput. Syst. Sci..

[27]  Martin Zinkevich,et al.  Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.

[28]  Peter Auer,et al.  Tracking the Best Disjunction , 1998, Machine Learning.

[29]  Manfred K. Warmuth,et al.  Relative Loss Bounds for Multidimensional Regression Problems , 1997, Machine Learning.

[30]  Tim Hesterberg,et al.  Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control , 2004, Technometrics.

[31]  Manfred K. Warmuth,et al.  Relative Loss Bounds for On-Line Density Estimation with the Exponential Family of Distributions , 1999, Machine Learning.

[32]  Claudio Gentile,et al.  On the generalization ability of on-line learning algorithms , 2001, IEEE Transactions on Information Theory.

[33]  Adam Tauman Kalai,et al.  Online convex optimization in the bandit setting: gradient descent without a gradient , 2004, SODA '05.

[34]  Tong Zhang Data Dependent Concentration Bounds for Sequential Prediction Algorithms , 2005, COLT.

[35]  Gábor Lugosi,et al.  Prediction, learning, and games , 2006 .

[36]  Yoram Singer,et al.  A primal-dual perspective of online learning algorithms , 2007, Machine Learning.

[37]  James C. Spall,et al.  Introduction to Stochastic Search and Optimization. Estimation, Simulation, and Control (Spall, J.C. , 2007 .

[38]  H. Robbins Some aspects of the sequential design of experiments , 1952 .

[39]  Thomas P. Hayes,et al.  The Price of Bandit Information for Online Optimization , 2007, NIPS.

[40]  Shai Shalev-Shwartz,et al.  Online learning: theory, algorithms and applications (למידה מקוונת.) , 2007 .

[41]  Ambuj Tewari,et al.  Optimal Stragies and Minimax Lower Bounds for Online Convex Games , 2008, COLT.

[42]  Claudio Gentile,et al.  Improved Risk Tail Bounds for On-Line Algorithms , 2005, IEEE Transactions on Information Theory.

[43]  Elad Hazan,et al.  Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization , 2008, COLT.

[44]  Shai Ben-David,et al.  Agnostic Online Learning , 2009, COLT.

[45]  Alexander Rakhlin,et al.  Lecture Notes on Online Learning DRAFT , 2009 .

[46]  Ambuj Tewari,et al.  Online Learning: Random Averages, Combinatorial Parameters, and Learnability , 2010, NIPS.

[47]  Elad Hazan The convex optimization approach to regret minimization , 2011 .

[48]  Ambuj Tewari,et al.  Regularization Techniques for Learning with Matrices , 2009, J. Mach. Learn. Res..