Deep Reinforcement Learning-Based Accurate Control of Planetary Soft Landing

Planetary soft landing has been studied extensively because of its promising applications. This paper proposes a soft landing control algorithm based on deep reinforcement learning (DRL) with good convergence properties. First, the soft landing problem of the powered descent phase is formulated, and the theoretical basis of the reinforcement learning (RL) methods used in this paper is introduced. Second, to ease convergence, the reward function is designed to include process rewards such as a velocity tracking reward, which addresses the sparse reward problem. By further including a fuel consumption penalty and a constraint violation penalty, the lander learns to track the target velocity while saving fuel and keeping its attitude angle within a safe range. Training simulations are then carried out under three classical RL frameworks, Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), and Soft Actor-Critic (SAC), and training converges in all three. Finally, the trained policy is deployed in velocity tracking and soft landing experiments, whose results demonstrate the validity of the proposed algorithm.
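
The reward structure described in the abstract can be illustrated with a minimal sketch. It assumes a per-step reward formed as the sum of a dense velocity tracking term, a fuel consumption penalty, and a fixed penalty when the attitude angle leaves its safe range. The names `RewardWeights` and `step_reward`, the weight values, and the exact functional form are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Illustrative weights only (assumptions, not values from the paper).
    tracking: float = 1.0
    fuel: float = 0.1
    violation: float = 10.0


def step_reward(velocity, target_velocity, fuel_used,
                attitude_angle, max_attitude_angle,
                weights=RewardWeights()):
    """Hypothetical shaped reward for one simulation step.

    velocity, target_velocity: sequences of equal length (e.g. 3 components).
    fuel_used: propellant mass consumed during this step.
    attitude_angle, max_attitude_angle: angles in radians.
    """
    # Dense process reward: negative Euclidean velocity tracking error,
    # so the agent receives a learning signal at every step.
    tracking_error = math.sqrt(
        sum((v - t) ** 2 for v, t in zip(velocity, target_velocity))
    )
    tracking_term = -weights.tracking * tracking_error

    # Penalty proportional to the fuel consumed in this step.
    fuel_term = -weights.fuel * fuel_used

    # Fixed penalty when the attitude angle exceeds the safe range.
    violated = abs(attitude_angle) > max_attitude_angle
    violation_term = -weights.violation if violated else 0.0

    return tracking_term + fuel_term + violation_term
```

Because the tracking term is evaluated at every step, the agent is not limited to a single terminal landing reward, which is the sparse reward issue the abstract refers to. A function of this shape could be called from the step routine of a simulation environment, with the weights tuned per task.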

[18]  Behcet Acikmese,et al.  Convex programming approach to powered descent guidance for mars landing , 2007 .