Factored Reinforcement Learning for Non-Player Character Behaviour (Apprentissage par renforcement factorisé pour le comportement de personnages non joueurs)

In this article, we apply a general reinforcement learning method to the automatic tuning of non-player character behaviours in a first-person shooter video game, Counter-Strike©. The result of learning is a set of decision trees that represent, in readable form, a model of the problem and the characters' decision policy. Finally, we discuss the scope of our method for building decision architectures for non-player characters in video games.
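To illustrate what a readable decision-tree policy might look like, here is a minimal sketch in Python. It is not taken from the article: the state variables (`enemy_visible`, `health`), their values, the actions, and the tree itself are hypothetical and hand-written, whereas the article's trees are learned. The sketch only shows how a policy over a factored state can be stored as a tree, queried, and printed in a form a designer can inspect.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    """Terminal node: the action chosen for every state reaching it."""
    action: str


@dataclass
class Node:
    """Internal node: tests one state variable and branches on its value."""
    variable: str
    children: Dict[str, "Tree"]


Tree = Union[Leaf, Node]


def decide(tree: Tree, state: Dict[str, str]) -> str:
    """Walk down the tree using the values of the factored state."""
    while isinstance(tree, Node):
        tree = tree.children[state[tree.variable]]
    return tree.action


def render(tree: Tree, indent: int = 0) -> str:
    """Return an indented, human-readable listing of the tree."""
    pad = "  " * indent
    if isinstance(tree, Leaf):
        return f"{pad}-> {tree.action}\n"
    lines = ""
    for value, child in tree.children.items():
        lines += f"{pad}{tree.variable} = {value}\n" + render(child, indent + 1)
    return lines


# Hypothetical hand-written policy (assumption, not a learned result).
POLICY = Node("enemy_visible", {
    "yes": Node("health", {"low": Leaf("retreat"), "high": Leaf("attack")}),
    "no": Leaf("explore"),
})

if __name__ == "__main__":
    print(render(POLICY))
    print(decide(POLICY, {"enemy_visible": "yes", "health": "low"}))
```

Because each internal node tests a single variable, the policy never has to enumerate every full state, which is the kind of compactness a factored representation aims for.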
