Beyond Banditron: A Conservative and Efficient Reduction for Online Multiclass Prediction with Bandit Setting Model

In this paper, we consider a recently proposed supervised learning problem: online multiclass prediction in the bandit setting. The learner receives only partial feedback on each online prediction, namely “true” when the predicted label is correct and “false” when it is wrong. This class of problems has attracted considerable interest because of its close relation to real-world internet applications and to human cognitive processes. While several algorithms have already been proposed, we present a novel algorithm for this setting. First, we reduce the multiclass prediction problem to binary classification using a Conservative one-versus-all-others Reduction scheme; then we embed the Online Passive-Aggressive algorithm as the binary learner to solve the reduced problem. We also derive a cumulative mistake bound for our algorithm and a time complexity bound that is linear in the sample size. Experimental evaluation on several real-world multiclass datasets, including RCV1, MNIST, 20 Newsgroups, and USPS, shows that our method substantially outperforms existing algorithms.
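The abstract does not give the update rule, so the following is only a minimal sketch of how a conservative one-versus-all reduction combined with a Passive-Aggressive binary learner might look, not the authors' exact algorithm. It assumes one linear scorer per class, a greedy argmax prediction with no exploration step, and the standard hinge-loss Passive-Aggressive step applied only to the scorer of the predicted class. The names `ConservativeOvaPA`, `update`, and `run_demo` are hypothetical.

```python
class ConservativeOvaPA:
    """Hypothetical sketch: one binary Passive-Aggressive scorer per class."""

    def __init__(self, num_classes, dim):
        self.weights = [[0.0] * dim for _ in range(num_classes)]

    def _dot(self, w, x):
        return sum(w_i * x_i for w_i, x_i in zip(w, x))

    def predict(self, x):
        # Assumption: greedy argmax over the per-class scores (ties go to
        # the lowest class index); no randomized exploration is modeled.
        scores = [self._dot(w, x) for w in self.weights]
        return max(range(len(scores)), key=lambda k: scores[k])

    def update(self, x, predicted, correct):
        # Bandit feedback only says whether the predicted label was right,
        # so only the predicted class's binary scorer is updated
        # (the "conservative" part, as assumed here).
        y = 1.0 if correct else -1.0
        w = self.weights[predicted]
        loss = max(0.0, 1.0 - y * self._dot(w, x))  # hinge loss
        sq_norm = sum(x_i * x_i for x_i in x)
        if loss > 0.0 and sq_norm > 0.0:
            tau = loss / sq_norm  # basic Passive-Aggressive step size
            for i, x_i in enumerate(x):
                w[i] += tau * y * x_i


def run_demo(passes=5):
    # Toy, linearly separable data: one indicator feature per class.
    data = [([1.0, 0.0, 0.0], 0), ([0.0, 1.0, 0.0], 1), ([0.0, 0.0, 1.0], 2)]
    model = ConservativeOvaPA(num_classes=3, dim=3)
    mistakes = 0
    for _ in range(passes):
        for x, label in data:
            guess = model.predict(x)
            correct = guess == label  # the only feedback the learner sees
            mistakes += 0 if correct else 1
            model.update(x, guess, correct)
    return model, mistakes


if __name__ == "__main__":
    model, mistakes = run_demo()
    print("cumulative mistakes:", mistakes)
```

Each round costs one score per class plus one weight-vector update, so the total work in this sketch grows linearly with the number of examples, consistent with the linear time bound the abstract claims. The true label is never passed to `update`; only the true/false signal is.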
