Solving Combinatorial Games using Products, Projections and Lexicographically Optimal Bases

To find Nash equilibria of two-player zero-sum games in which each player plays combinatorial objects such as spanning trees or matchings, we consider two online learning algorithms: online mirror descent (OMD) and multiplicative weights update (MWU). OMD requires computing a certain Bregman projection, which has closed-form solutions for simple convex sets such as the Euclidean ball or the simplex. For general polyhedra, however, one often has to fall back on the general machinery of convex optimization. We give a novel primal-style algorithm for computing Bregman projections onto the base polytopes of polymatroids. For MWU, although the regret scales logarithmically in the number $N$ of pure strategies (experts), the running time is polynomial in $N$; this becomes a problem when learning combinatorial objects, whose number is typically exponential in their natural dimension. We give a general recipe for simulating MWU in time polynomial in the natural dimension of these objects. The recipe applies whenever a polynomial-time generalized counting oracle (even an approximate one) exists over these objects. Finally, using the combinatorial structure of symmetric Nash equilibria (SNE) when both players play bases of matroids, we show that these equilibria can be found with a single projection or convex minimization, without using online learning.
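As a point of reference, the explicit MWU baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's efficient simulation: it stores one weight per pure strategy, so each round costs time proportional to $N$, and the function names (`mwu_zero_sum`, `duality_gap`), the step size `eta`, and the small loss matrix are all hypothetical choices made for the example. Losses are assumed to lie in $[0, 1]$.

```python
import math

def mwu_zero_sum(loss, rounds=2000, eta=0.05):
    """Both players run multiplicative weights on an explicit loss matrix.

    The row player minimizes loss[i][j]; the column player maximizes it.
    Returns the time-averaged mixed strategies of both players.
    Assumption: all entries of `loss` lie in [0, 1].
    """
    n, m = len(loss), len(loss[0])
    p = [1.0 / n] * n  # row player's current mixed strategy
    q = [1.0 / m] * m  # column player's current mixed strategy
    avg_p, avg_q = [0.0] * n, [0.0] * m
    for _ in range(rounds):
        for i in range(n):
            avg_p[i] += p[i] / rounds
        for j in range(m):
            avg_q[j] += q[j] / rounds
        # Expected loss (gain) of each pure strategy against the opponent's mix.
        row_loss = [sum(loss[i][j] * q[j] for j in range(m)) for i in range(n)]
        col_gain = [sum(loss[i][j] * p[i] for i in range(n)) for j in range(m)]
        # Multiplicative update, then renormalize to keep the numbers stable.
        p = [w * math.exp(-eta * l) for w, l in zip(p, row_loss)]
        q = [w * math.exp(eta * g) for w, g in zip(q, col_gain)]
        sp, sq = sum(p), sum(q)
        p = [w / sp for w in p]
        q = [w / sq for w in q]
    return avg_p, avg_q

def duality_gap(loss, p, q):
    """Best-response value against p minus best-response value against q."""
    n, m = len(loss), len(loss[0])
    worst_for_row = max(sum(loss[i][j] * p[i] for i in range(n)) for j in range(m))
    best_for_row = min(sum(loss[i][j] * q[j] for j in range(m)) for i in range(n))
    return worst_for_row - best_for_row

if __name__ == "__main__":
    game = [[1.0, 0.0], [0.0, 0.5]]  # hypothetical 2x2 game, value 1/3
    p_bar, q_bar = mwu_zero_sum(game)
    print(p_bar, q_bar, duality_gap(game, p_bar, q_bar))
```

The averaged strategies form an approximate equilibrium whose duality gap is bounded by the sum of the two players' average regrets. When the pure strategies are combinatorial objects, the weight vectors `p` and `q` become exponentially long, which is the bottleneck the counting-oracle recipe is meant to avoid.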
