Lyceum: An efficient and scalable ecosystem for robot learning

We introduce Lyceum, a high-performance computational ecosystem for robot learning. Lyceum is built on top of the Julia programming language and the MuJoCo physics simulator, combining the ease of use of a high-level programming language with the performance of native C. In addition, Lyceum provides a straightforward API for parallel computation across multiple cores and machines. Overall, depending on the complexity of the environment, Lyceum is 5-30x faster than other popular abstractions such as OpenAI's Gym and DeepMind's dm-control. This speedup substantially reduces training time for various reinforcement learning algorithms, and makes Lyceum fast enough to support real-time model predictive control through MuJoCo. The code, tutorials, and demonstration videos can be found at: www.lyceum.ml.
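
To make the parallel-computation claim concrete, the sketch below shows the general pattern of thread-parallel rollouts in Julia, where each thread owns its own environment instance. This is a minimal illustration under stated assumptions: ToyEnv, reset!, step!, and parallel_rollouts are hypothetical names invented for this sketch, not Lyceum's actual API, which is documented at www.lyceum.ml.

using Base.Threads

# Hypothetical stand-in for a simulator environment; Lyceum's real
# environment interface differs -- this only illustrates the pattern.
mutable struct ToyEnv
    state::Vector{Float64}
end

reset!(env::ToyEnv) = (env.state .= 0.0; env)

# Apply an action and return a scalar reward (here, a toy quantity).
step!(env::ToyEnv, action::Vector{Float64}) = (env.state .+= action; sum(env.state))

# Collect one rollout per environment, with each thread mutating only
# its own environment so no simulator state is shared across cores.
function parallel_rollouts(nenvs::Int, horizon::Int, dim::Int)
    envs = [ToyEnv(zeros(dim)) for _ in 1:nenvs]
    returns = zeros(nenvs)
    @threads for i in 1:nenvs
        reset!(envs[i])
        total = 0.0
        for _ in 1:horizon
            total += step!(envs[i], randn(dim))
        end
        returns[i] = total
    end
    return returns
end

parallel_rollouts(8, 100, 3)

Giving each thread a private environment copy is the isolation that a MuJoCo-backed environment would also require, since the underlying simulator state is mutable and not safe to share across threads.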
