dotRL: A platform for rapid Reinforcement Learning methods development and validation

This paper introduces dotRL, a platform that enables fast implementation and testing of Reinforcement Learning algorithms against diverse environments. dotRL is written for the .NET framework, and its main characteristics are: (i) adding a new learning algorithm or environment to the platform requires only implementing a simple interface, after which it is ready to be coupled with other environments and algorithms; (ii) a set of tools is included that aids in running and reporting experiments; (iii) a set of benchmark environments is included, some as demanding as Octopus-Arm and Half-Cheetah; (iv) the platform is available for immediate download, compilation, and execution, without requiring libraries from other sources.
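
The decoupling described in point (i) can be illustrated with a minimal sketch. It is written in Python for brevity and is not dotRL's actual .NET API: the names `Environment`, `Agent`, `Corridor`, `QLearningAgent`, and `run_episode` are all hypothetical, chosen only to show how an algorithm and an environment that each implement a small interface can be paired by a generic experiment loop.

```python
import random
from typing import Protocol, Tuple


class Environment(Protocol):
    """Hypothetical environment interface (an assumption, not dotRL's API)."""
    num_states: int
    num_actions: int

    def reset(self) -> int: ...
    def step(self, action: int) -> Tuple[int, float, bool]: ...


class Agent(Protocol):
    """Hypothetical learning-algorithm interface (an assumption, not dotRL's API)."""

    def act(self, state: int) -> int: ...
    def learn(self, state: int, action: int, reward: float,
              next_state: int, done: bool) -> None: ...


class Corridor:
    """Toy environment: start at cell 0, reward 1.0 on reaching the last cell."""

    def __init__(self, length: int = 5) -> None:
        self.num_states = length
        self.num_actions = 2  # 0 = move left, 1 = move right
        self._pos = 0

    def reset(self) -> int:
        self._pos = 0
        return self._pos

    def step(self, action: int) -> Tuple[int, float, bool]:
        move = 1 if action == 1 else -1
        # Clamp the position to the corridor bounds.
        self._pos = min(self.num_states - 1, max(0, self._pos + move))
        done = self._pos == self.num_states - 1
        return self._pos, (1.0 if done else 0.0), done


class QLearningAgent:
    """Tabular Q-learning with epsilon-greedy exploration."""

    def __init__(self, num_states: int, num_actions: int, alpha: float = 0.5,
                 gamma: float = 0.9, epsilon: float = 0.1, seed: int = 0) -> None:
        self.q = [[0.0] * num_actions for _ in range(num_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def greedy(self, state: int) -> int:
        # Break ties at random so the untrained agent does not favor one action.
        best = max(self.q[state])
        return self.rng.choice([a for a, v in enumerate(self.q[state]) if v == best])

    def act(self, state: int) -> int:
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[state]))
        return self.greedy(state)

    def learn(self, state: int, action: int, reward: float,
              next_state: int, done: bool) -> None:
        # Do not bootstrap from the successor state once the episode has ended.
        target = reward if done else reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def run_episode(env: Environment, agent: Agent, max_steps: int = 100) -> float:
    """Generic loop: works with any pair implementing the two interfaces."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = agent.act(state)
        next_state, reward, done = env.step(action)
        agent.learn(state, action, reward, next_state, done)
        total += reward
        state = next_state
        if done:
            break
    return total
```

Under these assumptions, an experiment amounts to constructing `Corridor(5)` and `QLearningAgent(env.num_states, env.num_actions)` and calling `run_episode` repeatedly. Swapping in a different environment or algorithm leaves the loop unchanged, which is the kind of coupling the abstract describes.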
