Online Learning with Costly Features and Labels

This paper introduces the online probing problem: in each round, the learner can purchase the values of a subset of the features. After using this information to produce a prediction for the round, the learner has the option of paying to see the loss function against which the prediction is evaluated. Either way, the learner pays both for the errors of his predictions and for whatever he chooses to observe, that is, the cost of the observed features and, if requested, the cost of observing the loss function for the round. We consider two variants of this problem, depending on whether the learner can observe the label for free. We provide algorithms, together with upper and lower bounds on the regret, for both variants. We show that a positive cost for observing the label significantly increases the regret of the problem.
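The per-round cost accounting described above can be sketched in a few lines of Python. This is only an illustrative reading of the abstract, not the paper's algorithm: the function name `round_cost`, its parameters, and the numeric values are all hypothetical, and the sketch assumes the costs are simply added to the prediction loss.

```python
from typing import Sequence, Set


def round_cost(prediction_loss: float,
               feature_costs: Sequence[float],
               purchased: Set[int],
               label_cost: float,
               label_observed: bool) -> float:
    """Total cost paid in one round (hypothetical helper, additive costs assumed)."""
    # Cost of the features the learner chose to buy in this round.
    probing = sum(feature_costs[i] for i in purchased)
    # The loss function (label) is paid for only if the learner asks to see it.
    if label_observed:
        probing += label_cost
    # The prediction loss is incurred whether or not the learner observes it.
    return prediction_loss + probing


# Made-up numbers: buy features 0 and 2, then pay to see the label.
total = round_cost(0.25, [0.5, 1.0, 2.0], {0, 2}, 1.0, True)
print(total)
```

Setting `label_cost` to `0.0` corresponds to the variant in which the label is observed for free; a positive value corresponds to the costly-label variant.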
