Tracking the Best of Many Experts

We present an online prediction algorithm that tracks the best expert efficiently even when the number of experts is exponentially large, provided that the set of experts has a structure that admits an efficient implementation of the exponentially weighted average predictor. As an example, we work out the case in which each expert is a path in a directed graph and the loss of an expert is the sum of the weights of the edges along its path.
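To illustrate why path-structured experts can be handled efficiently, the following is a minimal sketch, not the paper's algorithm. It assumes the graph is acyclic and computes the normalizing sum of the exponential weights exp(-eta * loss) over all source-to-sink paths by dynamic programming, without enumerating the paths. The function name `path_weight_sum`, the edge-list representation, and the parameter `eta` are illustrative assumptions, and the tracking (expert-switching) component is not shown.

```python
import math
from collections import defaultdict


def path_weight_sum(edges, cum_loss, eta, source, sink):
    """Sum of exp(-eta * path_loss) over all source->sink paths of a DAG.

    edges    : list of (u, v) pairs (assumed to form an acyclic graph)
    cum_loss : dict mapping each edge (u, v) to its cumulative loss so far
    eta      : learning rate of the exponential weighting (assumed > 0)
    """
    successors = defaultdict(list)
    for u, v in edges:
        successors[u].append(v)

    # memo[u] holds the total weight of all paths from u to the sink.
    memo = {sink: 1.0}

    def total(u):
        if u not in memo:
            # A path's weight factorizes over its edges, so the sum over
            # paths from u splits by the first edge taken out of u.
            memo[u] = sum(
                math.exp(-eta * cum_loss[(u, v)]) * total(v)
                for v in successors[u]
            )
        return memo[u]

    return total(source)


# Usage: a diamond graph with two paths, s->a->t and s->b->t.
edges = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]
cum_loss = {("s", "a"): 1.0, ("a", "t"): 0.0, ("s", "b"): 0.0, ("b", "t"): 2.0}
z = path_weight_sum(edges, cum_loss, eta=1.0, source="s", sink="t")
```

Because the loss of a path is a sum over its edges, its exponential weight is a product over edges, which is what lets the sum over exponentially many paths be computed in time linear in the number of edges.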
