Paper information - RLPy: a value-function-based reinforcement learning framework for education and research

RLPy is an object-oriented reinforcement learning software package with a focus on value-function-based methods using linear function approximation and discrete actions. The framework was designed for both educational and research purposes. It provides a rich library of fine-grained, easily exchangeable components for learning agents (e.g., policies or representations of value functions), facilitating recently increased specialization in reinforcement learning. RLPy is written in Python to allow fast prototyping, but is also suitable for large-scale experiments through its built-in support for optimized numerical libraries and parallelization. Code profiling, domain visualizations, and data analysis are integrated in a self-contained package available under the Modified BSD License at http://github.com/rlpy/rlpy. All of these properties allow users to compare various reinforcement learning algorithms with little effort.
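
The idea of exchangeable components around a value function with linear function approximation and discrete actions can be illustrated with a minimal sketch. The class names below (`OneHotRepresentation`, `EpsilonGreedyPolicy`, `LinearQLearner`), the toy chain domain, and all parameter values are assumptions made for illustration only; they are not RLPy's actual API. The learner uses standard Q-learning with one weight vector per action.

```python
import random


class OneHotRepresentation:
    """Hypothetical feature map: one indicator feature per discrete state."""

    def __init__(self, n_states):
        self.n_features = n_states

    def features(self, state):
        phi = [0.0] * self.n_features
        phi[state] = 1.0
        return phi


class EpsilonGreedyPolicy:
    """Hypothetical policy: random action with probability epsilon, else greedy."""

    def __init__(self, epsilon, rng):
        self.epsilon = epsilon
        self.rng = rng

    def choose(self, q_values):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(q_values))
        best = max(q_values)
        # Break ties uniformly at random among the best actions.
        return self.rng.choice([a for a, q in enumerate(q_values) if q == best])


class LinearQLearner:
    """Q-learning with Q(s, a) = w_a . phi(s); representation and policy are pluggable."""

    def __init__(self, representation, policy, n_actions, alpha=0.1, gamma=0.9):
        self.representation = representation
        self.policy = policy
        self.alpha = alpha
        self.gamma = gamma
        self.weights = [[0.0] * representation.n_features for _ in range(n_actions)]

    def q_values(self, state):
        phi = self.representation.features(state)
        return [sum(w_i * f_i for w_i, f_i in zip(w, phi)) for w in self.weights]

    def act(self, state):
        return self.policy.choose(self.q_values(state))

    def greedy_action(self, state):
        q = self.q_values(state)
        return max(range(len(q)), key=q.__getitem__)

    def learn(self, state, action, reward, next_state, terminal):
        # Temporal-difference target: bootstrap from the next state unless terminal.
        target = reward
        if not terminal:
            target += self.gamma * max(self.q_values(next_state))
        td_error = target - self.q_values(state)[action]
        phi = self.representation.features(state)
        for i, f_i in enumerate(phi):
            self.weights[action][i] += self.alpha * td_error * f_i


def chain_step(state, action, n_states):
    """Toy chain domain: action 1 moves right, action 0 moves left; reward 1 at the right end."""
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    terminal = next_state == n_states - 1
    return next_state, (1.0 if terminal else 0.0), terminal


def run_experiment(n_states=5, episodes=200, max_steps=100, seed=0):
    rng = random.Random(seed)
    agent = LinearQLearner(
        OneHotRepresentation(n_states), EpsilonGreedyPolicy(0.1, rng), n_actions=2
    )
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = agent.act(state)
            next_state, reward, terminal = chain_step(state, action, n_states)
            agent.learn(state, action, reward, next_state, terminal)
            state = next_state
            if terminal:
                break
    return agent


if __name__ == "__main__":
    trained = run_experiment()
    print([trained.greedy_action(s) for s in range(4)])
```

Because the learner only calls `features` on the representation and `choose` on the policy, either component can be swapped (for example, a different feature map or a softmax policy) without touching the update rule, which is the kind of fine-grained exchange the abstract describes.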
