In an online linear optimization problem, on each period t, an online algorithm chooses s_t ∈ S from a fixed (possibly infinite) set S of feasible decisions. Nature (who may be adversarial) chooses a weight vector w_t ∈ R^n, and the algorithm incurs cost c(s_t, w_t), where c is a fixed cost function that is linear in the weight vector. In the full-information setting, the vector w_t is then revealed to the algorithm, and in the bandit setting, only the cost experienced, c(s_t, w_t), is revealed. The goal of the online algorithm is to perform nearly as well as the best fixed s ∈ S in hindsight. Many repeated decision-making problems with weights fit naturally into this framework, such as online shortest-path, online TSP, online clustering, and online weighted set cover. Previously, it was shown how to convert any efficient exact offline optimization algorithm for such a problem into an efficient online algorithm in both the full-information and the bandit settings, with average cost nearly as good as that of the best fixed s ∈ S in hindsight. However, in the case where the offline algorithm is an approximation algorithm with ratio α > 1, the previous approach only worked for special types of approximation algorithms. We show how to convert any efficient offline α-approximation algorithm for a linear optimization problem into an efficient algorithm for the corresponding online problem, with average cost not much larger than α times that of the best s ∈ S, in both the full-information and the bandit settings. Our main innovation is in the full-information setting: we combine Zinkevich's algorithm for convex optimization with a geometric transformation that can be applied to any approximation algorithm. In the bandit setting, standard techniques apply, except that a "Barycentric Spanner" for the problem is also (provably) necessary as input.
Our algorithm can also be viewed as a method for playing large repeated games, where one can only compute approximate best-responses, rather than best-responses.

TTI-C, sham@tti-c.org; Georgia Tech, atk@cc.gatech.edu; Carnegie Mellon, katrina@cs.cmu.edu
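To make the full-information setting concrete, the following is a minimal sketch of the building block named in the abstract, Zinkevich's online gradient descent, applied to linear costs c(s, w) = s · w. It is not the paper's construction: it assumes the feasible set is the probability simplex, where an exact projection is easy, and it omits the geometric transformation that handles α-approximation algorithms. The function names (project_to_simplex, online_gradient_descent, best_fixed_cost), the step size eta, and the choice of feasible set are all illustrative assumptions.

```python
import math


def project_to_simplex(v):
    """Euclidean projection of v onto {x : x_i >= 0, sum(x) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        # The condition holds for a prefix of the sorted values; the last
        # candidate that satisfies it is the correct shift.
        if u_k - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def online_gradient_descent(weight_vectors, eta):
    """Play s_t, pay s_t . w_t, then see w_t and take a projected step.

    Assumption: the feasible set S is the probability simplex.
    """
    n = len(weight_vectors[0])
    s = [1.0 / n] * n  # start at the centre of the simplex
    decisions, total_cost = [], 0.0
    for w in weight_vectors:
        decisions.append(s)
        total_cost += sum(s_i * w_i for s_i, w_i in zip(s, w))
        # Full information: w is revealed. The gradient of s . w in s is w.
        s = project_to_simplex([s_i - eta * w_i for s_i, w_i in zip(s, w)])
    return decisions, total_cost


def best_fixed_cost(weight_vectors):
    """Cost of the best fixed decision in hindsight.

    A linear cost over the simplex is minimised at a vertex, so it is
    enough to compare the coordinate-wise totals.
    """
    n = len(weight_vectors[0])
    return min(sum(w[i] for w in weight_vectors) for i in range(n))


# Hypothetical weight sequence: coordinate 0 is costly on two rounds in three.
T = 200
weights = [[1.0, 0.0] if t % 3 else [0.0, 1.0] for t in range(T)]
decisions, alg_cost = online_gradient_descent(weights, eta=1.0 / math.sqrt(T))
regret = alg_cost - best_fixed_cost(weights)
print(f"average regret per round: {regret / T:.4f}")
```

With a fixed step size of 1/sqrt(T), the standard analysis of online gradient descent bounds the total regret by a constant times sqrt(T), so the average cost approaches that of the best fixed decision as T grows. The setting in the abstract differs in that only an α-approximate offline oracle is available, which is why the benchmark there is α times the best fixed cost.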