Online Learning with a Hint

We study a variant of online linear optimization where the player receives a hint about the loss function at the beginning of each round. The hint is given in the form of a vector that is weakly correlated with the loss vector on that round. We show that the player can benefit from such a hint if the set of feasible actions is sufficiently round. Specifically, if the set is strongly convex, the hint can be used to guarantee a regret of $O(\log T)$, and if the set is $q$-uniformly convex for $q \in (2,3)$, the hint can be used to guarantee a regret of $o(\sqrt{T})$. In contrast, we establish $\Omega(\sqrt{T})$ lower bounds on regret when the set of feasible actions is a polyhedron.
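For concreteness, the following is a minimal sketch of the underlying setting: the standard regret measure in online linear optimization and one common way to formalize a hint that is "weakly correlated" with the loss. The symbols $K$, $c_t$, $h_t$, and $\alpha$ are notational assumptions introduced here for illustration, not quotations from the paper.

% Online linear optimization over a feasible action set K in R^d.
% On round t the player picks x_t in K, then the loss vector c_t is revealed,
% and regret is measured against the best fixed action in hindsight:
\[
  \mathrm{Regret}(T) \;=\; \sum_{t=1}^{T} \langle c_t, x_t \rangle \;-\; \min_{x \in K} \sum_{t=1}^{T} \langle c_t, x \rangle .
\]
% One standard formalization of the hint (assumed here): h_t is a unit vector,
% revealed before x_t is chosen, whose correlation with the loss is bounded
% below by some alpha > 0,
\[
  \langle c_t, h_t \rangle \;\ge\; \alpha \,\lVert c_t \rVert_2 \qquad \text{for all } t .
\]

Under a formalization of this kind, the abstract's results can be read as regret bounds whose dependence on $T$ improves with the curvature of $K$: logarithmic when $K$ is strongly convex, $o(\sqrt{T})$ when $K$ is $q$-uniformly convex with $q \in (2,3)$, and no improvement over $\Omega(\sqrt{T})$ when $K$ is a polyhedron.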
