Paper information - Autonomous Planetary Landing via Deep Reinforcement Learning and Transfer Learning

This work aims to develop an application for autonomous planetary landing. We exploit Deep Reinforcement Learning and Transfer Learning to tackle landing in unknown or barely known extraterrestrial environments by learning well-performing policies that can be transferred from the training environment to new environments without losing optimality. To this end, we build a physics-based simulator using the Bullet/PyBullet library. It consists of a lander, defined through the standard ROS/URDF framework, and realistic 3D terrain models, for which we adapt official NASA 3D meshes reconstructed from data retrieved during missions. Where such models were not available, we reconstruct the terrain from mission imagery, generally SAR imagery. In this setup, we train a Deep Reinforcement Learning model, using DDPG, to land autonomously in the lunar environment. Moreover, we perform transfer learning on the Mars and Titan environments. While still preliminary, our results show that DDPG can learn a good landing policy, which can be transferred to other environments.
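For readers unfamiliar with DDPG, the following is a minimal sketch of two ingredients of the generic algorithm: the critic's bootstrapped regression target and the soft (Polyak) update of the target networks. It is not the authors' implementation. The function names, the flat-list representation of network weights, and the default values of `gamma` and `tau` are illustrative assumptions; the abstract gives no hyperparameters.

```python
from typing import List


def td_target(reward: float, next_q: float, done: bool, gamma: float = 0.99) -> float:
    """Critic target y = r + gamma * Q'(s', mu'(s')).

    `next_q` stands for the target critic evaluated at the next state and
    the target actor's action. Terminal transitions do not bootstrap.
    `gamma` is an illustrative default, not a value from the paper.
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target: List[float], online: List[float], tau: float = 0.001) -> List[float]:
    """Polyak averaging: theta_target <- tau * theta_online + (1 - tau) * theta_target.

    Weights are plain lists of floats here (an assumption made to keep the
    sketch dependency-free); a real model would apply this per tensor.
    """
    if len(target) != len(online):
        raise ValueError("target and online weights must have the same length")
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]


if __name__ == "__main__":
    # One non-terminal transition with hypothetical numbers.
    print(td_target(reward=1.0, next_q=10.0, done=False, gamma=0.5))
    # Target weights move halfway toward the online weights when tau = 0.5.
    print(soft_update([0.0, 2.0], [1.0, 2.0], tau=0.5))
```

A small `tau` makes the target networks track the online networks slowly, which stabilizes the bootstrapped critic target during training.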
