Leading strategies in competitive on-line prediction

We start from a simple asymptotic result for the problem of on-line regression with the quadratic loss function: the class of continuous limited-memory prediction strategies admits a "leading prediction strategy". This strategy not only asymptotically performs at least as well as any continuous limited-memory strategy, but also has the property that the excess loss of any continuous limited-memory strategy is determined by how closely it imitates the leading strategy. More specifically, for any class of prediction strategies constituting a reproducing kernel Hilbert space, we construct a leading strategy, in the sense that the loss of any prediction strategy whose norm is not too large is determined by how closely it imitates the leading strategy. This result is extended to the loss functions given by Bregman divergences and by strictly proper scoring rules.
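The phrase "excess loss is determined by how closely it imitates the leading strategy" can be made concrete with a small numerical sketch. The sketch below is only an illustration under assumptions that are not in the abstract: it replaces the worst-case on-line setting with a toy stochastic one, in which outcomes are a fixed function of a signal plus independent noise and the leading strategy is taken to be that noise-free function. The function names, the rival strategy, and all parameter values are hypothetical. In this toy setting the mean quadratic loss of the rival should be close to the loss of the leading strategy plus the mean squared distance between the two sets of predictions.

```python
import math
import random


def mean_quadratic_loss(predictions, outcomes):
    """Average of (outcome - prediction)^2 over all rounds."""
    return sum((y - p) ** 2 for p, y in zip(predictions, outcomes)) / len(outcomes)


def mean_squared_distance(first, second):
    """Average squared gap between two sequences of predictions."""
    return sum((p - q) ** 2 for p, q in zip(first, second)) / len(first)


def simulate(n_rounds=100_000, noise=0.1, seed=0):
    """Toy stochastic stand-in for the on-line protocol (an assumption).

    Returns (loss of leading strategy, loss of rival strategy,
    mean squared distance between their predictions).
    """
    rng = random.Random(seed)
    signals = [rng.uniform(-1.0, 1.0) for _ in range(n_rounds)]

    # Assumed "leading" predictions: the noise-free regression function.
    leading = [math.sin(3.0 * x) for x in signals]
    # Outcomes are the leading predictions corrupted by Gaussian noise.
    outcomes = [m + rng.gauss(0.0, noise) for m in leading]
    # A hypothetical rival strategy that imitates the leader only roughly.
    rival = [0.5 * x for x in signals]

    loss_leading = mean_quadratic_loss(leading, outcomes)
    loss_rival = mean_quadratic_loss(rival, outcomes)
    distance = mean_squared_distance(rival, leading)
    return loss_leading, loss_rival, distance


if __name__ == "__main__":
    loss_leading, loss_rival, distance = simulate()
    print("loss of leading strategy:", loss_leading)
    print("loss of rival strategy:  ", loss_rival)
    print("leading loss + distance: ", loss_leading + distance)
```

The gap between the last two printed numbers is the averaged cross term between the noise and the difference of the two predictions, which shrinks as the number of rounds grows. The result described in the abstract is stronger than this toy case suggests, since it makes no stochastic assumption about how the outcomes are generated.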
