Pearl: Parallel Evolutionary and Reinforcement Learning Library

Reinforcement learning is increasingly finding success across domains where the problem can be represented as a Markov decision process. Evolutionary computation algorithms have also proven successful in this domain, exhibiting performance comparable to that of the generally more complex reinforcement learning methods. Whilst many open-source reinforcement learning and evolutionary computation libraries exist, no publicly available library combines the two approaches for enhanced comparison, cooperation, or visualization. To this end, we have created Pearl (https://github.com/LondonNode/Pearl), an open-source Python library designed to allow researchers to rapidly and conveniently perform optimized reinforcement learning, evolutionary computation, and combinations of the two. The key features of Pearl include: modular and expandable components, opinionated module settings, TensorBoard integration, custom callbacks, and comprehensive visualizations.
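
The evolutionary side of the RL/EC combination described above can be made concrete with a small, library-agnostic sketch: a basic evolution strategies loop in the style of Salimans et al. (2017), where Gaussian perturbations of the policy parameters are scored by episode return and aggregated into a single update. All names here (evaluate, es_step, the toy objective) are illustrative assumptions and do not reflect Pearl's actual API.

import numpy as np

def evaluate(params):
    # Toy stand-in for an episode return: reward peaks when params match a target.
    target = np.ones_like(params)
    return -float(np.sum((params - target) ** 2))

def es_step(params, pop_size=50, sigma=0.1, lr=0.02, rng=None):
    # One evolution strategies update (Salimans et al., 2017).
    rng = rng if rng is not None else np.random.default_rng()
    noise = rng.standard_normal((pop_size, params.size))
    returns = np.array([evaluate(params + sigma * n) for n in noise])
    # Normalize returns so the update is scale-invariant.
    advantages = (returns - returns.mean()) / (returns.std() + 1e-8)
    # Monte Carlo estimate of the gradient of expected return.
    grad = (advantages[:, None] * noise).sum(axis=0) / (pop_size * sigma)
    return params + lr * grad

params = np.zeros(8)
for _ in range(200):
    params = es_step(params)
print(evaluate(params))  # approaches 0 as params converge to the target

In a combined setup of the kind Pearl targets, a population loop like this can run alongside, or wrap around, a gradient-based RL learner, which is the general idea behind hybrid methods such as CEM-RL.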
