No-Regret Learnability for Piecewise Linear Losses

In the convex optimization approach to online regret minimization, many methods have been developed that guarantee an $O(\sqrt{T})$ regret bound for subdifferentiable convex loss functions with bounded subgradients, via a reduction to linear loss functions. This reduction suggests that linear loss functions tend to be the hardest to learn against, regardless of the underlying decision sets. We investigate this question systematically, looking at the interplay between the sets of possible moves for both the decision maker and the adversarial environment. This allows us to highlight sharp distinctions in the learnability of piecewise linear loss functions. On the one hand, when the decision maker's decision set is a polyhedron, we establish $\Omega(\sqrt{T})$ lower bounds on regret for a large class of piecewise linear loss functions, with important applications in online linear optimization, repeated zero-sum Stackelberg games, online prediction with side information, and online two-stage optimization. On the other hand, we exhibit $o(\sqrt{T})$ learning rates, achieved by the Follow-The-Leader algorithm, in online linear optimization when the boundary of the decision maker's decision set is curved and $0$ does not lie in the convex hull of the environment's decision set. Hence, the curvature of the decision maker's decision set is a determining factor in the optimal learning rate. These results hold in a completely adversarial setting.
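To make the second regime concrete, below is a minimal sketch (not taken from the paper) of Follow-The-Leader for online linear optimization over the unit Euclidean ball, a decision set with curved boundary, against loss vectors whose convex hull stays away from the origin. The dimension, the loss distribution, and the helper `ftl_step` are illustrative assumptions, chosen only to mimic the setting in which the abstract reports $o(\sqrt{T})$ rates.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 10_000, 5

def ftl_step(cumulative_loss):
    """FTL over the unit ball: argmin_{||x|| <= 1} <G_t, x> = -G_t / ||G_t||."""
    norm = np.linalg.norm(cumulative_loss)
    if norm == 0.0:
        return np.zeros_like(cumulative_loss)
    return -cumulative_loss / norm

# Environment plays loss vectors g_t centered away from the origin (an
# illustrative choice), so 0 stays outside the convex hull of its moves
# with high probability.
G = np.zeros(d)      # running sum of observed loss vectors
regret = 0.0
for t in range(T):
    x_t = ftl_step(G)                                   # play before seeing g_t
    g_t = np.array([2.0, 0, 0, 0, 0]) + rng.normal(scale=0.5, size=d)
    regret += g_t @ x_t
    G += g_t

# Best fixed decision in hindsight over the unit ball incurs total loss -||G_T||.
regret += np.linalg.norm(G)
print(f"T = {T}, regret = {regret:.2f}, sqrt(T) = {np.sqrt(T):.1f}")
```

Under these assumptions the printed regret is expected to grow much more slowly than $\sqrt{T}$; replacing the ball with a polyhedron (e.g. the simplex) is the regime where the paper's $\Omega(\sqrt{T})$ lower bounds apply.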
