A competitive approach to game learning

Machine learning of game strategies has often depended on competitive methods that continually develop new strategies capable of defeating previous ones. We use a very inclusive definition of a game and consider a framework in which a competitive algorithm makes repeated use of a strategy learning component that can learn strategies which defeat a given set of opponents. We describe game learning in terms of sets M and X of first and second player strategies, and connect the model with more familiar models of concept learning. We show the importance of the ideas of teaching set [9] and specification number [2] k in this new context. The performance of several competitive algorithms is investigated, using both worst-case and randomized strategy learning algorithms. Our central result (Theorem 4) is a competitive algorithm that solves games in a total number of strategies polynomial in lg(|M|), lg(|X|), and k. Its use is demonstrated, including an application in concept learning with a new kind of counterexample oracle. We conclude with a complexity analysis of game learning, and list a number of new questions arising from this work.
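To make the framework concrete, the following is a minimal sketch of a competitive loop built around a strategy learning component. It is not the algorithm of Theorem 4 and carries no polynomial bound: the learner is a brute-force search, and the names `learn_strategy`, `naive_competition`, and `first_wins`, as well as the toy game, are assumptions introduced only for illustration. Here a game counts as solved once a first player strategy is found that no second player strategy defeats.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def learn_strategy(
    candidates: Sequence[A],
    opponents: Sequence[B],
    wins: Callable[[A, B], bool],
) -> Optional[A]:
    """Hypothetical strategy learner: return the first candidate that
    defeats every given opponent, or None if no candidate does."""
    for c in candidates:
        if all(wins(c, o) for o in opponents):
            return c
    return None


def naive_competition(
    M: Sequence[A],
    X: Sequence[B],
    first_wins: Callable[[A, B], bool],
) -> Tuple[Optional[A], int]:
    """Alternate between learning a first player strategy that beats all
    known second player strategies and searching for a second player
    strategy that beats it. Returns (perfect strategy or None, number of
    first player strategies produced)."""
    known_x: List[B] = []
    produced = 0
    while True:
        h = learn_strategy(M, known_x, first_wins)
        if h is None:
            # No first player strategy beats every known opponent.
            return None, produced
        produced += 1
        # Look for a second player strategy that defeats h.
        x = learn_strategy(X, [h], lambda xs, hs: not first_wins(hs, xs))
        if x is None:
            # Nothing in X defeats h, so h solves the game.
            return h, produced
        # x defeats h, so it cannot already be in known_x: the list grows
        # on every pass and the loop terminates for finite X.
        known_x.append(x)


# Toy game (an assumption): strategies are integers, and the first player
# wins whenever its number is at least the second player's number.
solution, rounds = naive_competition(range(8), range(8), lambda h, x: h >= x)
print(solution, rounds)
```

In this toy game the loop climbs through the first player strategies one opponent at a time, which illustrates why the number of strategies a competitive algorithm needs is the quantity of interest in the abstract.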

[1] Arthur L. Samuel, et al. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.

[2] Vasek Chvátal, et al. A Greedy Heuristic for the Set-Covering Problem, 1979, Math. Oper. Res.

[3] Leslie G. Valiant, et al. Random Generation of Combinatorial Structures from a Uniform Distribution, 1986, Theor. Comput. Sci.

[4] Dana Angluin. Negative results for equivalence queries, 1990, Mach. Learn.

[5] W. Daniel Hillis, et al. Co-evolving parasites improve simulated evolution as an optimization procedure, 1990.

[6] Wolfgang Maass, et al. On-line learning with an oblivious environment and the power of randomization, 1991, COLT '91.

[7] M. Kearns, et al. On the complexity of teaching, 1991, COLT '91.

[8] John Shawe-Taylor, et al. On exact specification by examples, 1992, COLT '92.

[9] Terrence J. Sejnowski, et al. Temporal Difference Learning of Position Evaluation in the Game of Go, 1993, NIPS.

[10] Mihalis Yannakakis, et al. On complexity as bounded rationality (extended abstract), 1994, STOC '94.

[11] Chuen-Tsai Sun, et al. Genetic algorithm learning in game playing with multiple coaches, 1994, Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence.

[12] Barak A. Pearlmutter, et al. Playing the matching-shoulders lob-pass game with logarithmic regret, 1994, COLT '94.

[13] Susan L. Epstein. Toward an Ideal Trainer, 1994.

[14] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.

[15] Anthony V. Sebald, et al. Minimax design of neural net controllers for highly uncertain plants, 1994, IEEE Trans. Neural Networks.

[16] Karl Sims, et al. Evolving 3d morphology and behavior by competition, 1994.

[17] Richard J. Lipton, et al. Simple strategies for large zero-sum games with applications to complexity theory, 1994, STOC '94.

[18] Sampath Kannan, et al. Oracles and queries that are sufficient for exact learning (extended abstract), 1994, COLT '94.

[19] Claude-Nicolas Fiechter, et al. Efficient reinforcement learning, 1994, COLT '94.

[20] Sampath Kannan, et al. Oracles and Queries That Are Sufficient for Exact Learning, 1996, J. Comput. Syst. Sci.

[21] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming, 1995, ICML.

[22] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.

[23] Ronitt Rubinfeld, et al. Efficient algorithms for learning to play repeated games against computationally bounded adversaries, 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.

[24] Dave Cliff, et al. Tracking the Red Queen: Measurements of Adaptive Progress in Co-Evolutionary Simulations, 1995, ECAL.

[25] Richard K. Belew, et al. Methods for Competitive Co-Evolution: Finding Opponents Worth Beating, 1995, ICGA.

[26] Jordan B. Pollack, et al. Coevolution of a Backgammon Player, 1996.