Paper Information - Reinforcement Learning-Based Satellite Attitude Stabilization Method for Non-Cooperative Target Capturing

When a satellite performs complex tasks such as discarding a payload or capturing a non-cooperative target, it encounters sudden changes in its attitude and mass parameters, which cause it to fly unstably and tumble. In such circumstances, the changes in the motion and mass characteristics are unpredictable. Traditional attitude control methods therefore cannot stabilize the satellite, since they depend on the mass parameters of the controlled object. In this paper, we propose a reinforcement learning method to re-stabilize the attitude of a satellite under such circumstances. Specifically, we discretize the continuous control torque and build a neural network model that outputs the discretized control torque to control the satellite. A dynamics simulation environment of the satellite is built, and the deep Q-network (DQN) algorithm is then used to train the neural network in this simulation environment. The training reward is based on the stabilization of the satellite. Simulation experiments show that, as training progresses, the neural network model gradually learns to re-stabilize the attitude of a satellite after an unknown disturbance. In contrast, a traditional PD (proportional-derivative) controller was unable to re-stabilize the satellite because of its dependence on the mass parameters. The proposed method controls satellite attitude through self-learning, shows a degree of intelligence and generality, and has strong application potential for the future intelligent control of satellites performing complex space tasks.
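The abstract's two central ideas, discretizing the control torque and training against a dynamics simulation, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three torque levels per axis, the principal-axis (diagonal) inertia, the explicit-Euler integration step, the reward shape, and all function names.

```python
import itertools
import math
import random

# Hypothetical torque levels per body axis, in N*m; the abstract gives no values.
TORQUE_LEVELS = (-0.1, 0.0, 0.1)

# Each combination of per-axis levels is one discrete action: 3**3 = 27 actions.
# A Q-network would have one output per entry of this list.
ACTIONS = list(itertools.product(TORQUE_LEVELS, repeat=3))


def step_angular_velocity(omega, torque, inertia_diag, dt=0.01):
    """One explicit-Euler step of Euler's rigid-body equations.

    Assumes body axes aligned with the principal axes, so the inertia
    tensor is diagonal (a simplification made for this sketch).
    """
    wx, wy, wz = omega
    tx, ty, tz = torque
    ix, iy, iz = inertia_diag
    dwx = (tx - (iz - iy) * wy * wz) / ix
    dwy = (ty - (ix - iz) * wz * wx) / iy
    dwz = (tz - (iy - ix) * wx * wy) / iz
    return (wx + dt * dwx, wy + dt * dwy, wz + dt * dwz)


def stabilization_reward(omega):
    """Hypothetical reward: larger (closer to zero) when the body spins less."""
    return -math.sqrt(sum(w * w for w in omega))


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice of an index into ACTIONS."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

In a training loop, the index returned by `select_action` would be looked up in `ACTIONS` and passed as `torque` to `step_angular_velocity`. The controller never needs the inertia values itself; only the simulator does, which matches the abstract's motivation of avoiding dependence on mass parameters.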
