Tunable Dynamics in Agent-Based Simulation using Multi-Objective Reinforcement Learning

Agent-based simulation is a powerful tool for studying complex systems of interacting agents. To achieve good results, the behavior models used for the agents must be of high quality. Traditionally, these models have been handcrafted by domain experts, which is a difficult, expensive, and time-consuming process. In contrast, reinforcement learning allows agents to learn how to achieve their goals by interacting with the environment. However, after training, the behavior of such agents is often static, i.e., it can no longer be adjusted by a human. This makes it difficult to adapt agent behavior to specific user needs, which may vary across different runs of the simulation. In this paper, we address this problem by studying how multi-objective reinforcement learning can be used as a framework for building tunable agents, whose characteristics can be adjusted at runtime to promote adaptiveness and diversity in agent-based simulation. We propose an agent architecture that allows us to adapt popular deep reinforcement learning algorithms to multi-objective environments. We empirically show that our method allows us to train tunable agents that can approximate the policies of multiple species of agents.
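To make the general idea concrete, the following is a minimal sketch, not the authors' implementation: the class names, network sizes, and the use of linear scalarization are assumptions. It shows one common way to build a tunable agent, by conditioning a DQN-style network on a preference vector over objectives, so that changing the preference weights at runtime changes which trade-off the policy pursues.

# Hypothetical sketch (assumed names and dimensions), not the paper's code:
# a DQN-style network conditioned on a tunable preference vector over objectives,
# with linear scalarization of the vector reward assumed.
import torch
import torch.nn as nn


class TunableQNetwork(nn.Module):
    """Q-network whose input is the observation concatenated with a preference
    vector w; changing w at runtime changes the trade-off the policy pursues."""

    def __init__(self, obs_dim: int, n_objectives: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + n_objectives, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor, pref: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, pref], dim=-1))


def scalarize(vector_reward: torch.Tensor, pref: torch.Tensor) -> torch.Tensor:
    # Collapse the vector reward into a scalar with the same weights the network
    # is conditioned on, so standard single-objective training machinery
    # (e.g. DQN or PPO updates) can be reused unchanged.
    return (vector_reward * pref).sum(dim=-1)


if __name__ == "__main__":
    q = TunableQNetwork(obs_dim=8, n_objectives=2, n_actions=4)
    obs = torch.randn(1, 8)
    reward = torch.tensor([[1.0, -0.5]])  # one reward component per objective
    # Two different preference vectors may yield different greedy actions and
    # different scalarized rewards, without any retraining.
    for w in (torch.tensor([[0.9, 0.1]]), torch.tensor([[0.2, 0.8]])):
        action = q(obs, w).argmax(dim=-1).item()
        print(w.tolist(), "action:", action, "scalarized reward:", scalarize(reward, w).item())

During training, one would typically sample different preference vectors so that the network learns a family of policies rather than a single fixed one; this is a common pattern in multi-objective reinforcement learning and is an assumption here, not a detail taken from the paper.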
