Paper Information - Time-Varying Formation Controllers for Unmanned Aerial Vehicles Using Deep Reinforcement Learning

Time-Varying Formation Controllers for Unmanned Aerial Vehicles Using Deep Reinforcement Learning

We consider the problem of designing scalable and portable controllers that let unmanned aerial vehicles (UAVs) reach time-varying formations as quickly as possible. This brief confirms that deep reinforcement learning can be used in a multi-agent fashion to drive UAVs to any formation while taking optimality and portability into account. A deep neural network estimates how good a state is, and the agent chooses its actions according to that estimate. The system is tested with several low-dimensional sensory inputs without any change to the neural network architecture, algorithm, or hyperparameters; only additional training is required.
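The idea of estimating how good a state is and choosing actions from that estimate can be illustrated with a minimal value-based sketch. Everything specific here is an assumption made for illustration, not the paper's design: the action set `ACTIONS`, the three-element state (a planar offset to a formation slot plus a bias term), the learning rate and discount, and above all the linear model, which stands in for the deep neural network so the example stays dependency-free.

```python
import random

# Hypothetical discrete action set: planar velocity commands (assumption).
ACTIONS = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0), (0.0, 0.0)]


def q_values(weights, state):
    """Return one value estimate per action.

    A linear model (one weight row per action) stands in for the deep network.
    """
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def choose_action(weights, state, epsilon, rng):
    """Epsilon-greedy selection: explore with probability epsilon,
    otherwise pick the action with the highest estimated value."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = q_values(weights, state)
    return max(range(len(values)), key=values.__getitem__)


def td_update(weights, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q(state, action) toward the TD target.

    Updates the weight row of the chosen action in place and returns the TD error.
    """
    target = reward + gamma * max(q_values(weights, next_state))
    error = target - q_values(weights, state)[action]
    weights[action] = [w + alpha * error * s for w, s in zip(weights[action], state)]
    return error


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [[0.0, 0.0, 0.0] for _ in ACTIONS]
    state = [1.0, 0.0, 1.0]  # hypothetical: (dx, dy) offset to the slot, plus bias
    action = choose_action(weights, state, epsilon=0.1, rng=rng)
    td_update(weights, state, action, reward=1.0, next_state=[0.0, 0.0, 1.0])
    print(q_values(weights, state))
```

In a multi-agent setting, each UAV could run the same selection rule on its own observation, which is one way a single trained estimator might be reused across agents; whether the paper shares parameters in this way is not stated in this listing.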

Carlos Torre-Ferrero | Ronny Conde | José Ramón Llata

[1] Nikhil Nigam, et al. Control of Multiple UAVs for Persistent Surveillance: Algorithm and Flight Test Results, 2012, IEEE Transactions on Control Systems Technology.

[2] Hasan Mehrjerdi, et al. A survey on multiple unmanned vehicles formation control and coordination: Normal and fault situations, 2013, 2013 International Conference on Unmanned Aircraft Systems (ICUAS).

[3] Wang Rui, et al. Adaptive time-varying formation control for high-order LTI multi-agent systems, 2015, 2015 34th Chinese Control Conference (CCC).

[4] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.

[5] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.

[6] Hai Lin, et al. Hybrid three-dimensional formation control for unmanned helicopters, 2013, Autom.

[7] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.

[8] John Salvatier, et al. Theano: A Python framework for fast computation of mathematical expressions, 2016, ArXiv.

[9] Sonia Waharte, et al. Coordinated Search with a Swarm of UAVs, 2009, 2009 6th IEEE Annual Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks Workshops.

[10] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[11] John N. Tsitsiklis, et al. Analysis of temporal-difference learning with function approximation, 1996, NIPS 1996.

[12] Yisheng Zhong, et al. Time-Varying Formation Control for Unmanned Aerial Vehicles: Theories and Applications, 2015, IEEE Transactions on Control Systems Technology.