Path planning for asteroid hopping rovers with pre-trained deep reinforcement learning architectures

Abstract: Asteroid surface exploration is challenging due to complex terrain topology and irregular gravity fields. Hopping rovers are considered a promising mobility solution for exploring the surfaces of small celestial bodies. Conventional path-planning tasks, such as traversing a given map to reach a known target, can become particularly challenging for hopping rovers when the terrain exhibits sufficiently complex 3-D structure. As an alternative to traditional path-planning approaches, this work explores applying deep reinforcement learning (DRL) to plan the path of a hopping rover across a highly irregular surface. The 3-D terrain of the asteroid surface is converted into a level matrix, which serves as the input to the reinforcement learning algorithm. A DRL architecture with good convergence and stability properties is presented to solve the rover path-planning problem. Numerical simulations validate the effectiveness and robustness of the proposed method on two different types of 3-D terrain.
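The abstract's pipeline (quantize a 3-D heightmap into a level matrix, then learn a traversal policy with reinforcement learning) can be illustrated with a minimal sketch. This is not the paper's architecture: the function names, the number of levels, the reward shaping (penalizing large level jumps between adjacent cells), and the use of tabular Q-learning in place of a deep network are all illustrative assumptions.

```python
import numpy as np

def terrain_to_level_matrix(heightmap, n_levels=8):
    """Quantize a continuous 2-D heightmap into integer terrain levels."""
    # Interior bin edges split [min, max] into n_levels equal bands.
    edges = np.linspace(heightmap.min(), heightmap.max(), n_levels + 1)[1:-1]
    return np.digitize(heightmap, edges)          # entries in 0..n_levels-1

def q_learning_path(levels, start, goal, episodes=2000,
                    alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning on the level matrix; hops across a large level
    difference are penalized in proportion to that difference."""
    rng = np.random.default_rng(seed)
    rows, cols = levels.shape
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # up, down, left, right
    Q = np.zeros((rows, cols, len(moves)))
    for _ in range(episodes):
        r, c = start
        for _ in range(4 * rows * cols):          # cap episode length
            a = (int(rng.integers(len(moves))) if rng.random() < eps
                 else int(np.argmax(Q[r, c])))    # epsilon-greedy action
            dr, dc = moves[a]
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                nr, nc, reward = r, c, -5.0       # bounced off the map edge
            elif (nr, nc) == goal:                # terminal update, no bootstrap
                Q[r, c, a] += alpha * (100.0 - Q[r, c, a])
                break
            else:                                 # step cost + level-jump penalty
                reward = -1.0 - abs(int(levels[nr, nc]) - int(levels[r, c]))
            Q[r, c, a] += alpha * (reward + gamma * Q[nr, nc].max() - Q[r, c, a])
            r, c = nr, nc
    # Greedy rollout of the learned policy from start to goal.
    path, (r, c) = [start], start
    while (r, c) != goal and len(path) < rows * cols:
        dr, dc = moves[int(np.argmax(Q[r, c]))]
        r = min(max(r + dr, 0), rows - 1)
        c = min(max(c + dc, 0), cols - 1)
        path.append((r, c))
    return path
```

A deep variant would replace the Q table with a network that maps the level matrix (plus rover and goal positions) to action values, which is what makes the approach scale to the large, highly irregular terrains the paper targets.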
