Paper information - simple_rl: Reproducible Reinforcement Learning in Python

simple_rl: Reproducible Reinforcement Learning in Python

Conducting reinforcement-learning experiments can be a complex and time-consuming process. A full experimental pipeline will typically consist of a simulation of an environment, an implementation of one or many learning algorithms, a variety of additional components designed to facilitate the agent-environment interplay, and any requisite analysis, plotting, and logging thereof. In light of this complexity, this paper introduces simple_rl, a new open source library for carrying out reinforcement learning experiments in Python 2 and 3 with a focus on simplicity. The goal of simple_rl is to support seamless, reproducible methods for running reinforcement learning experiments. This paper gives an overview of the core design philosophy of the package, explains how it differs from existing libraries, and showcases its central features.

[Figure: "Reproduction: Gridworld H 3 W 4". x-axis: Episode Number; y-axis: Cumulative Reward; curves: Q-learning, Random.]
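The pipeline described in the abstract (an environment simulation, one or more learning agents, and a loop that runs them and logs results) can be illustrated with a minimal, dependency-free sketch. This is not simple_rl's API: every name below (`GridWorld`, `QLearningAgent`, `RandomAgent`, `run_experiment`) and every hyperparameter value is a hypothetical stand-in chosen for illustration. The setup loosely mirrors the 4-wide, 3-high gridworld with a Q-learning agent and a random agent named in the figure, and fixes random seeds so repeated runs give identical curves.

```python
import random


class GridWorld:
    """Deterministic width x height grid; reward 1.0 on reaching the goal."""

    ACTIONS = ("up", "down", "left", "right")
    MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

    def __init__(self, width=4, height=3, start=(1, 1), goal=(4, 3)):
        self.width, self.height = width, height
        self.start, self.goal = start, goal

    def reset(self):
        return self.start

    def step(self, state, action):
        # Moves that would leave the grid keep the agent at the wall.
        dx, dy = self.MOVES[action]
        x = min(max(state[0] + dx, 1), self.width)
        y = min(max(state[1] + dy, 1), self.height)
        next_state = (x, y)
        done = next_state == self.goal
        return next_state, (1.0 if done else 0.0), done


class RandomAgent:
    name = "Random"

    def __init__(self, actions, rng):
        self.actions, self.rng = actions, rng

    def act(self, state):
        return self.rng.choice(self.actions)

    def update(self, state, action, reward, next_state, done):
        pass  # A random agent does not learn.


class QLearningAgent:
    name = "Q-learning"

    def __init__(self, actions, rng, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.actions, self.rng = actions, rng
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = {}  # (state, action) -> estimated value, default 0.0

    def _q(self, state, action):
        return self.q.get((state, action), 0.0)

    def act(self, state):
        # Epsilon-greedy action selection with random tie-breaking.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        best = max(self._q(state, a) for a in self.actions)
        return self.rng.choice([a for a in self.actions if self._q(state, a) == best])

    def update(self, state, action, reward, next_state, done):
        # Standard one-step tabular Q-learning update.
        future = 0.0 if done else self.gamma * max(self._q(next_state, a) for a in self.actions)
        old = self._q(state, action)
        self.q[(state, action)] = old + self.alpha * (reward + future - old)


def run_experiment(agent_classes, env, episodes=50, steps=10, seed=0):
    """Return {agent name: cumulative reward after each episode}."""
    results = {}
    for i, agent_class in enumerate(agent_classes):
        agent = agent_class(env.ACTIONS, random.Random(seed + i))
        total, curve = 0.0, []
        for _ in range(episodes):
            state = env.reset()
            for _ in range(steps):
                action = agent.act(state)
                next_state, reward, done = env.step(state, action)
                agent.update(state, action, reward, next_state, done)
                total += reward
                state = next_state
                if done:
                    break
            curve.append(total)
        results[agent.name] = curve
    return results


if __name__ == "__main__":
    curves = run_experiment([QLearningAgent, RandomAgent], GridWorld())
    for name, curve in curves.items():
        print(name, "final cumulative reward:", curve[-1])
```

Because each agent draws from its own seeded `random.Random` instance, calling `run_experiment` twice with the same `seed` reproduces the same cumulative-reward curves, which is the kind of reproducibility the abstract emphasizes.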

David Abel
