Programmation et apprentissage bayésien pour les jeux vidéo multi-joueurs, application à l'intelligence artificielle de jeux de stratégies temps-réel. (Bayesian Programming and Learning for Multi-Player Video Games, Application to RTS AI)

This thesis explores the use of Bayesian models for the AI of multi-player video games, in particular the AI of real-time strategy (RTS) games. Video games sit between robotics and full simulation: the other players are not simulated, and the AI has no control over the simulation. RTS games require simultaneously performing reactive actions (unit control) and making strategic (technological, economic) and tactical (spatial, temporal) decisions. We used Bayesian modeling as an alternative to (Boolean) logic, since it can work with incomplete, and therefore uncertain, information. Indeed, the incomplete specification of "scripted" behaviors, or the incomplete specification of possible states in planning, calls for a solution that can handle this uncertainty. Machine learning helps reduce the complexity of specifying such models. We show that Bayesian programming can integrate all kinds of sources of uncertainty (hidden states, intentions, stochasticity) through the realization of a fully robotic StarCraft player. Probability distributions are a means of carrying, without loss, the information we have, which can represent, as needed: constraints, partial knowledge, an estimate of the state space, and the incompleteness of the model itself. In the first part of this thesis, we review existing solutions to the problems that arise when building a multi-player game AI, giving an overview of the complex computational and cognitive characteristics of the main game genres. From this survey, we summarize the cross-cutting categories of problems and introduce how they can be addressed by Bayesian modeling.
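The way probability distributions carry partial knowledge can be made concrete with a small sketch. The following is a minimal, illustrative example (the opening names, priors, and likelihoods are assumptions made up for this sketch, not figures from the thesis): Bayes' rule combines a prior over enemy strategies with whatever observations happen to be available, and unobserved variables are simply left out, so incomplete information degrades the estimate gracefully instead of breaking it.

```python
# Hypothetical priors and likelihoods for three StarCraft "openings";
# all numbers are illustrative, not taken from the thesis.
PRIOR = {"rush": 0.3, "tech": 0.3, "expand": 0.4}

# P(observation = True | opening) for two boolean scouting observations
LIKELIHOOD = {
    "early_barracks": {"rush": 0.8, "tech": 0.3, "expand": 0.2},
    "fast_gas":       {"rush": 0.2, "tech": 0.9, "expand": 0.4},
}

def posterior(observations):
    """Bayes' rule under partial observations: any observation not in
    `observations` is left out of the product, which is how the model
    tolerates the fog of war (incomplete information)."""
    scores = dict(PRIOR)
    for obs, seen in observations.items():
        for opening in scores:
            p = LIKELIHOOD[obs][opening]
            scores[opening] *= p if seen else (1.0 - p)
    z = sum(scores.values())  # normalization constant
    return {k: v / z for k, v in scores.items()}

# "fast_gas" is unknown (not yet scouted) -> inference still works
print(posterior({"early_barracks": True}))
```

The same mechanism scales to the structured models of the thesis: adding an observation is one more factor in the product, and a missing one costs nothing.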
We then explain how to build a Bayesian program from domain knowledge and observations, through a simple role-playing game example. In the second part of the thesis, we detail the application of this approach to RTS AI, along with the models we arrived at. For reactive behavior (micro-management), we present a decentralized, real-time multi-agent controller inspired by sensory-motor fusion. We then achieve dynamic adaptation of our strategies and tactics to those of the opponent by modeling them with machine learning (supervised and unsupervised) from the game logs of high-level players. These probabilistic player models can be used both to predict the opponent's decisions and actions, and for our own decision-making, by substituting our inputs for theirs. Finally, we explain the architecture of our robotic StarCraft player and give some implementation details. Beyond the models and their implementations, there are three main contributions: plan recognition and opponent modeling through machine learning, taking advantage of the game's structure; multi-scale decision-making in the presence of uncertain information; and the integration of Bayesian models into the real-time control of an artificial player.
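The flavor of the sensory-motor-fusion controller for micro-management can be sketched as follows. This is a toy illustration under assumptions of our own (the direction set, the exponential "sensor" scores, and the weight 2.0 are invented for the sketch, not the thesis's actual model): each unit scores candidate movement directions by multiplying independent soft preferences, one per sensory input, and follows the highest-scoring direction.

```python
import math

# Four candidate unit directions (east, north, west, south).
DIRECTIONS = [(1, 0), (0, 1), (-1, 0), (0, -1)]

def objective_pull(d, target):
    """Soft preference for moving toward the objective direction."""
    align = d[0] * target[0] + d[1] * target[1]
    return math.exp(align)

def danger_avoidance(d, threat):
    """Soft penalty for moving toward a known threat direction."""
    align = d[0] * threat[0] + d[1] * threat[1]
    return math.exp(-2.0 * align)

def choose_direction(target, threat):
    # Fusion: P(direction) is proportional to the product of the
    # independent sensor distributions; take the argmax.
    scores = {d: objective_pull(d, target) * danger_avoidance(d, threat)
              for d in DIRECTIONS}
    return max(scores, key=scores.get)

# The unit wants to go east, but a threat also lies east:
# fusion of the two distributions makes it retreat instead.
print(choose_direction(target=(1, 0), threat=(1, 0)))
```

Because each factor is a simple local distribution, the controller stays decentralized and cheap enough for real-time, per-unit decisions.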
