Competitive Online Generalized Linear Regression under Square Loss

We apply the Aggregating Algorithm to online regression under the square loss function. We develop an algorithm that is competitive with the benchmark class of generalized linear models (our "experts"), which are used in a wide range of practical tasks. The resulting prediction problem does not appear to be analytically tractable, so we develop a prediction algorithm based on the Markov chain Monte Carlo method, which we show to be fast and reliable in many cases. We prove upper bounds on the cumulative square loss of the algorithm. We also run experiments with our algorithm on a toy data set and two real-world ozone level data sets, and give suggestions for choosing its parameters.
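To make the idea concrete, here is a minimal sketch of one way such a predictor could be organized. It is not the algorithm analyzed in the paper. It assumes a logistic link, outcomes in [0, 1], a Gaussian prior over the experts' weight vectors, and a plain random-walk Metropolis sampler. All function names and parameters (`sample_experts`, `aa_predict`, `a`, `step`, `burn_in`) are hypothetical. The learning rate eta = 2 / (hi - lo)^2 and the substitution rule used in `aa_predict` are the standard choices for the Aggregating Algorithm under square loss on a bounded interval.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function (assumed link; other links are possible).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def expert_prediction(theta, x):
    # Prediction of the generalized linear expert with weight vector theta.
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def log_weight(theta, history, eta, a):
    # Unnormalized log-weight of an expert: Gaussian prior (assumed) times
    # exp(-eta * cumulative square loss on the data seen so far).
    loss = sum((y - expert_prediction(theta, x)) ** 2 for x, y in history)
    return -a * sum(t * t for t in theta) - eta * loss


def sample_experts(history, dim, eta, a, n_samples, step, rng, burn_in=200):
    # Random-walk Metropolis over weight vectors (a stand-in MCMC sampler).
    theta = [0.0] * dim
    lw = log_weight(theta, history, eta, a)
    samples = []
    for i in range(burn_in + n_samples):
        proposal = [t + rng.gauss(0.0, step) for t in theta]
        lw_prop = log_weight(proposal, history, eta, a)
        # Accept with probability min(1, weight ratio).
        if rng.random() < math.exp(min(0.0, lw_prop - lw)):
            theta, lw = proposal, lw_prop
        if i >= burn_in:
            samples.append(list(theta))
    return samples


def aa_predict(x, samples, eta, lo=0.0, hi=1.0):
    # Monte Carlo estimate of the generalized prediction g(y), followed by the
    # standard square-loss substitution rule on [lo, hi].
    def g(y):
        vals = [math.exp(-eta * (y - expert_prediction(th, x)) ** 2) for th in samples]
        return -math.log(sum(vals) / len(vals)) / eta

    gamma = (lo + hi) / 2.0 - (g(hi) - g(lo)) / (2.0 * (hi - lo))
    return min(hi, max(lo, gamma))


# Toy online loop: predict, observe the outcome, then add it to the history.
rng = random.Random(0)
eta = 2.0  # 2 / (hi - lo)**2 with outcomes in [0, 1]
stream = [([1.0, 0.5], 1.0), ([1.0, -0.5], 0.0), ([1.0, 1.0], 1.0)]
history, total_loss = [], 0.0
for x, y in stream:
    experts = sample_experts(history, dim=2, eta=eta, a=1.0,
                             n_samples=100, step=0.5, rng=rng)
    gamma = aa_predict(x, experts, eta)
    total_loss += (y - gamma) ** 2
    history.append((x, y))
```

With a single sampled expert the substitution rule returns that expert's own prediction, which gives a quick sanity check. In this sketch the sampler restarts from zero at every step; warm-starting from the previous step's final state would be a natural refinement.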
