Deep Kernels for Optimizing Locomotion Controllers

Sample efficiency is important when optimizing parameters of locomotion controllers, since hardware experiments are time-consuming and expensive. Bayesian Optimization, a sample-efficient optimization framework, has recently been widely applied to this problem, but further gains in sample efficiency are needed for practical applicability to real-world robots and high-dimensional controllers. To address this, prior work has proposed using domain expertise to construct custom distance metrics for locomotion. In this work we show how to learn such a distance metric automatically: we use a neural network to learn an informed distance metric from data obtained in high-fidelity simulations. We conduct experiments on two different controllers and robot architectures. First, we demonstrate improved sample efficiency when optimizing a 5-dimensional controller on ATRIAS robot hardware. We then conduct simulation experiments optimizing a 16-dimensional controller for a 7-link robot model and obtain significant improvements even when optimizing in perturbed environments. Our approach thus enhances sample efficiency for two different controllers, making it a promising candidate for further hardware experiments.
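The core idea of using a learned distance metric inside Bayesian Optimization can be sketched as a "deep kernel": a standard RBF kernel whose distances are computed in the output space of a neural network feature map rather than on the raw controller parameters. The sketch below is illustrative only, not the paper's implementation: the feature map `phi` uses fixed random weights as a hypothetical stand-in for a network trained on simulation data, and the objective `f` is a toy surrogate for a hardware rollout cost.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

# Hypothetical stand-in for the learned feature map: a small MLP with fixed
# random weights. In the described approach this network would instead be
# trained on high-fidelity simulation data; training is omitted here.
W1, b1 = rng.standard_normal((5, 16)), rng.standard_normal(16)
W2, b2 = rng.standard_normal((16, 4)), rng.standard_normal(4)

def phi(X):
    """Map 5-D controller parameters into a learned feature space."""
    return np.tanh(X @ W1 + b1) @ W2 + b2

def deep_kernel(A, B, lengthscale=1.0):
    """RBF kernel evaluated on phi(.), so distances use the learned metric."""
    FA, FB = phi(A), phi(B)
    d2 = ((FA[:, None, :] - FB[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(X, y, Xq, noise=1e-4):
    """Standard GP regression posterior mean/variance at query points Xq."""
    K = deep_kernel(X, X) + noise * np.eye(len(X))
    Kq = deep_kernel(X, Xq)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, Kq)
    mu = Kq.T @ alpha
    var = np.maximum(np.diag(deep_kernel(Xq, Xq)) - (v ** 2).sum(0), 1e-12)
    return mu, var

def expected_improvement(mu, var, best):
    """EI acquisition for minimization of the rollout cost."""
    s = np.sqrt(var)
    z = (best - mu) / s
    cdf = np.array([0.5 * (1.0 + erf(zi / sqrt(2.0))) for zi in z])
    pdf = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return s * (z * cdf + pdf)

# Toy objective standing in for an expensive hardware rollout cost.
target = np.full(5, 0.3)
f = lambda X: ((X - target) ** 2).sum(-1)

X = rng.uniform(-1, 1, (6, 5))       # initial controller evaluations
y = f(X)
cand = rng.uniform(-1, 1, (256, 5))  # candidate next parameter settings
mu, var = gp_posterior(X, y, cand)
x_next = cand[np.argmax(expected_improvement(mu, var, y.min()))]
```

In a full Bayesian Optimization loop, `x_next` would be evaluated on the robot (or in simulation), appended to `(X, y)`, and the acquisition maximized again; the only change relative to a vanilla GP is that the kernel measures similarity in `phi`'s feature space.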
