A Bayesian optimization approach to find Nash equilibria

Game theory now finds a broad range of applications in engineering and machine learning. However, in a derivative-free, expensive black-box context, very few algorithmic solutions are available for finding game equilibria. Here, we propose a novel Gaussian-process-based approach for solving games in this context. We follow a classical Bayesian optimization framework, with sequential sampling decisions based on acquisition functions. Two strategies are proposed, based either on the probability of achieving equilibrium or on the stepwise uncertainty reduction paradigm. Practical and numerical aspects are discussed in order to enhance scalability and reduce computation time. Our approach is evaluated on several synthetic game problems with varying numbers of players and decision space dimensions. We show that equilibria can be found reliably for a fraction of the cost (in terms of black-box evaluations) compared to classical, derivative-based algorithms. The method is implemented in the R package GPGame, available on CRAN at https://cran.r-project.org/package=GPGame.
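
To make the target of the search concrete, here is a minimal Python sketch of what a pure Nash equilibrium is for two players with finite strategy sets. It is an illustration only, not the GPGame API: the function name `pure_nash_equilibria` and the cost tables are hypothetical, and we assume that both players minimize their own cost and that the full cost tables are known. In the expensive black-box setting described above these tables are not available up front, which is why a surrogate model and a sequential sampling rule are needed.

```python
import numpy as np


def pure_nash_equilibria(cost1, cost2):
    """Return all (i, j) index pairs that are pure Nash equilibria.

    cost1[i, j] and cost2[i, j] are the costs (to be minimized) of
    players 1 and 2 when player 1 plays strategy i and player 2 plays
    strategy j. Hypothetical helper, written for illustration only.
    """
    cost1 = np.asarray(cost1, dtype=float)
    cost2 = np.asarray(cost2, dtype=float)
    # Player 1 controls the row: for each column j, mark the rows that
    # attain the minimum of cost1 in that column (best responses).
    best1 = cost1 == cost1.min(axis=0, keepdims=True)
    # Player 2 controls the column: for each row i, mark the columns
    # that attain the minimum of cost2 in that row.
    best2 = cost2 == cost2.min(axis=1, keepdims=True)
    # An equilibrium is a cell where both players are best-responding.
    return [(int(i), int(j)) for i, j in np.argwhere(best1 & best2)]


# Toy example (prisoner's dilemma in cost form; index 0 = cooperate,
# index 1 = defect). These numbers are made up for the illustration.
cost1 = [[1, 3],
         [0, 2]]
cost2 = [[1, 0],
         [3, 2]]
print(pure_nash_equilibria(cost1, cost2))  # [(1, 1)]
```

Neither player can lower their own cost by deviating alone from the returned cell, which is the defining property of a Nash equilibrium.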
