Reinforcement Learning for Decision-Making in a Business Simulator

Business simulators are powerful tools both for supporting the decision-making of business managers and for business education. An example is SIMBA (SIMulator for Business Administration), a simulator currently used as a web-based platform for business education in different institutions. In this paper, we propose applying reinforcement learning (RL) to create intelligent agents that can manage virtual companies in SIMBA. This application is not trivial, given the intrinsic characteristics of SIMBA: it is a generalized domain in which hundreds of parameters modify the domain behavior; it is a multi-agent domain in which cooperation and competition among agents can coexist; and each business decision requires setting dozens of continuous decision variables, which is done only after studying hundreds of continuous variables. We demonstrate empirically that all these challenges can be overcome through the use of RL, showing results for different learning scenarios.
