Pearl: Parallel Evolutionary and Reinforcement Learning Library

Reinforcement learning is increasingly finding success across domains where the problem can be represented as a Markov decision process. Evolutionary computation algorithms have also proven successful in this domain, exhibiting performance comparable to that of the generally more complex reinforcement learning methods. Whilst many open-source reinforcement learning and evolutionary computation libraries exist, no publicly available library combines the two approaches for enhanced comparison, cooperation, or visualization. To this end, we have created Pearl (https://github.com/LondonNode/Pearl), an open-source Python library designed to allow researchers to rapidly and conveniently perform optimized reinforcement learning, evolutionary computation, and combinations of the two. The key features of Pearl include: modular and expandable components, opinionated module settings, TensorBoard integration, custom callbacks, and comprehensive visualizations.
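
The evolutionary side of the RL/EC combination described above can be made concrete with a small, library-agnostic sketch: a basic evolution strategies loop in the style of Salimans et al. (2017), where Gaussian perturbations of the policy parameters are scored by episode return and aggregated into a single update. All names here (evaluate, es_step, the toy objective) are illustrative assumptions and do not reflect Pearl's actual API.

import numpy as np

def evaluate(params):
    # Toy stand-in for an episode return: reward peaks when params match a target.
    target = np.ones_like(params)
    return -float(np.sum((params - target) ** 2))

def es_step(params, pop_size=50, sigma=0.1, lr=0.02, rng=None):
    # One evolution strategies update (Salimans et al., 2017).
    rng = rng if rng is not None else np.random.default_rng()
    noise = rng.standard_normal((pop_size, params.size))
    returns = np.array([evaluate(params + sigma * n) for n in noise])
    # Normalize returns so the update is scale-invariant.
    advantages = (returns - returns.mean()) / (returns.std() + 1e-8)
    # Monte Carlo estimate of the gradient of expected return.
    grad = (advantages[:, None] * noise).sum(axis=0) / (pop_size * sigma)
    return params + lr * grad

params = np.zeros(8)
for _ in range(200):
    params = es_step(params)
print(evaluate(params))  # approaches 0 as params converge to the target

In a combined setup of the kind Pearl targets, a population loop like this can run alongside, or wrap around, a gradient-based RL learner, which is the general idea behind hybrid methods such as CEM-RL.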
