Paper information - dotRL: A platform for rapid Reinforcement Learning methods development and validation

This paper introduces dotRL, a platform that enables fast implementation and testing of Reinforcement Learning algorithms against diverse environments. dotRL is written for the .NET framework, and its main characteristics are: (i) adding a new learning algorithm or environment to the platform only requires implementing a simple interface, after which it is ready to be coupled with other environments and algorithms; (ii) a set of tools is included that aids in running and reporting experiments; (iii) a set of benchmark environments is included, some as demanding as Octopus-Arm and Half-Cheetah; (iv) the platform is available for immediate download, compilation, and execution, without requiring libraries from other sources.

Pawel Wawrzynski | Bartosz Papis
