Hybrid Stochastic-Adversarial On-line Learning

Most research in online learning has focused either on adversarial classification, in which both inputs and labels are chosen arbitrarily by an adversary, or on the traditional supervised learning problem, in which samples are drawn independently from a fixed probability distribution. In a number of domains, however, the relationship between inputs and labels may be adversarial while the input instances themselves are generated according to a fixed distribution. This scenario can be formalized as a hybrid classification problem in which inputs are stochastic and labels are adversarial. In this paper, we introduce this hybrid stochastic-adversarial classification problem, propose an online learning algorithm for its solution, and analyze its performance. In particular, we show that, given a hypothesis space H with finite VC dimension, it is possible to incrementally build a suitable finite set of hypotheses that can be used as input to an exponentially weighted forecaster, achieving a cumulative regret of order O(√(n VC(H) log n)) with overwhelming probability. Finally, we discuss extensions to multi-label classification, learning from experts, and bandit settings with stochastic side information, as well as applications to games.
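To make the prediction component concrete, the sketch below shows a standard exponentially weighted forecaster run over a finite hypothesis set, which is the kind of forecaster the abstract refers to. The finite list `hypotheses`, the `stream` of (input, label) pairs, and the learning rate `eta` are illustrative assumptions; the paper's actual contribution is the incremental construction of that finite set from H, which is not reproduced here.

```python
import numpy as np

def exp_weighted_forecaster(hypotheses, stream, eta):
    """Exponentially weighted forecaster over a finite hypothesis set.

    hypotheses: list of functions h(x) -> {0, 1}
    stream:     iterable of (x, y) pairs revealed one at a time
    eta:        learning rate, e.g. sqrt(8 * log(len(hypotheses)) / n)
    Returns the sequence of predictions and the cumulative 0/1 loss.
    """
    K = len(hypotheses)
    log_w = np.zeros(K)              # log-weights, kept in log space for stability
    total_loss = 0
    predictions = []
    for x, y in stream:
        votes = np.array([h(x) for h in hypotheses])
        probs = np.exp(log_w - log_w.max())
        probs /= probs.sum()
        # randomized prediction: predict 1 with the weighted probability of the experts voting 1
        y_hat = int(np.random.rand() < probs @ votes)
        predictions.append(y_hat)
        total_loss += int(y_hat != y)
        # multiplicative update: down-weight hypotheses that erred on (x, y)
        log_w -= eta * (votes != y).astype(float)
    return predictions, total_loss
```

With K hypotheses and horizon n, a learning rate of order √(log K / n) yields the usual O(√(n log K)) regret against the best hypothesis in the set; when the set is built from a class H of finite VC dimension, log K grows like VC(H) log n, which is consistent with the O(√(n VC(H) log n)) bound stated above.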
