Intelligent Adjustment of Game Properties at Run Time Using Multi-armed Bandits

Dynamically modifying game properties to match players’ preferences can be an essential factor in successful game design. This paper proposes a technique based on the multi-armed bandit (MAB) approach for intelligent, dynamic theme selection in a video game. The epsilon-greedy algorithm is used to implement the MAB approach and to apply players’ preferences in the game. To evaluate the efficacy of the proposed technique, a 3D-Roll ball game with four different themes was developed. In this game, two properties determine the theme: the color of the gaming environment and the speed of the player. The results of a user study performed on this system show that the technique has the potential to be used as a toolkit for determining players’ preferences in real time.
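As a rough illustration of how epsilon-greedy theme selection could work, the sketch below treats each of the four themes as one bandit arm and keeps a running mean of the reward observed for each. It is not the authors' implementation: the class name `EpsilonGreedyThemeSelector`, the default `epsilon` of 0.1, the reward scale, and the `simulate` helper with its hidden like-probabilities are all assumptions made for the example, since the abstract does not say how player preference is measured.

```python
import random


class EpsilonGreedyThemeSelector:
    """Hypothetical epsilon-greedy selector; each theme is one bandit arm."""

    def __init__(self, n_themes, epsilon=0.1, seed=None):
        self.epsilon = epsilon            # probability of exploring (assumed value)
        self.counts = [0] * n_themes      # times each theme has been shown
        self.values = [0.0] * n_themes    # running mean reward per theme
        self.rng = random.Random(seed)

    def select(self):
        """Return the index of the theme to show next."""
        if self.rng.random() < self.epsilon:
            # Explore: pick any theme uniformly at random.
            return self.rng.randrange(len(self.counts))
        # Exploit: pick a theme with the highest mean reward, ties broken at random.
        best = max(self.values)
        candidates = [i for i, v in enumerate(self.values) if v == best]
        return self.rng.choice(candidates)

    def update(self, theme, reward):
        """Fold one observed reward into the running mean for that theme."""
        self.counts[theme] += 1
        self.values[theme] += (reward - self.values[theme]) / self.counts[theme]


def simulate(like_probs, rounds=1000, epsilon=0.1, seed=0):
    """Simulated player (assumption): likes theme i with probability like_probs[i]."""
    selector = EpsilonGreedyThemeSelector(len(like_probs), epsilon, seed)
    player = random.Random(seed + 1)
    for _ in range(rounds):
        theme = selector.select()
        reward = 1.0 if player.random() < like_probs[theme] else 0.0
        selector.update(theme, reward)
    return selector
```

In a game loop, `select()` would be called when a new round starts and `update()` once a reward for that round is available, for example a rating or a play-time measure; which signal to use is a design choice the abstract leaves open. A larger `epsilon` keeps sampling less-preferred themes more often, which helps if preferences drift but shows the favored theme less frequently.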
