ToriLLE: Learning Environment for Hand-to-Hand Combat

We present the Toribash Learning Environment (ToriLLE), a learning environment for machine learning agents based on the video game Toribash. Toribash is a MuJoCo-like environment in which two humanoid characters fight each other hand-to-hand, controlled by changing the actuation modes of their joints. The competitive nature of Toribash, together with its focused domain, makes it a platform for evaluating self-play methods and for evaluating machine learning agents against human players. In this paper we describe ToriLLE's capabilities and limitations, and experimentally demonstrate its applicability as a learning environment with baseline and human experiments. The source code of the environment and the conducted experiments can be found at https://github.com/Miffyli/ToriLLE.
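To make the control interface concrete, below is a minimal sketch of a random-agent loop against the environment. The class and method names (`ToribashControl`, `init`, `get_state`, `make_actions`, `finish`), the import path, and the joint/actuation counts are assumptions inferred from the description above, not a confirmed API; consult the repository linked above for the actual interface.

```python
# Minimal sketch of a random-agent loop for ToriLLE (names and counts are assumptions).
import random

try:
    from torille import ToribashControl  # assumed import path; see the repository
except ImportError:
    ToribashControl = None

NUM_CONTROLLABLES = 22  # assumed: controllable joints/grips per character
NUM_JOINT_STATES = 4    # assumed actuation modes per joint (e.g. extend/contract/hold/relax)

def random_action():
    """One integer actuation mode for each controllable joint of one character."""
    return [random.randint(1, NUM_JOINT_STATES) for _ in range(NUM_CONTROLLABLES)]

def run_random_episode():
    controller = ToribashControl()
    controller.init()                       # assumed to launch the Toribash process
    state, terminal = controller.get_state()
    while not terminal:
        # Both characters are controlled by the caller; here both act randomly.
        controller.make_actions([random_action(), random_action()])
        state, terminal = controller.get_state()
    controller.finish()                      # assumed to close the game

if __name__ == "__main__" and ToribashControl is not None:
    run_random_episode()
```

The key design point this illustrates is the multi-discrete action space: each step is a vector of per-joint actuation choices rather than continuous torques, which is what distinguishes Toribash from a standard MuJoCo control task.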
