Target following for an autonomous underwater vehicle using regularized ELM-based reinforcement learning

Dynamic target following in unknown environments is a key problem on the path to autonomous, intelligent operation of autonomous underwater vehicles (AUVs). Reinforcement learning (RL) offers the possibility of learning a policy for a given task without manual intervention or prior experience. However, standard RL algorithms struggle with the continuous state and action spaces that are ubiquitous in underwater environments. This paper proposes a method for following a moving target with an AUV using RL based on a regularized extreme learning machine (RELM). Q-learning, the most widely used RL algorithm in practice, generates the control policy for the AUV, while RELM, a modification of the ELM, provides the function approximation needed to handle continuous states and actions while preserving generalization performance. Simulation results show that the AUV successfully tracks and follows the moving target using the proposed method.
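To make the combination concrete, the sketch below shows a ridge-regularized ELM used as a Q-function approximator inside a fitted Q-learning loop. This is a minimal illustration under stated assumptions, not the paper's implementation: it simplifies to a discrete action set (the paper targets continuous actions), and the class names, hyperparameters, and toy transition data are all invented for the example.

```python
import numpy as np

class RegularizedELM:
    """Minimal ridge-regularized ELM regressor (illustrative sketch)."""

    def __init__(self, n_inputs, n_hidden, n_outputs, lam=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Random input weights and biases are fixed after initialization,
        # as in the standard ELM formulation; only beta is ever trained.
        self.W = rng.normal(size=(n_inputs, n_hidden))
        self.b = rng.normal(size=n_hidden)
        self.beta = np.zeros((n_hidden, n_outputs))
        self.lam = lam  # ridge penalty controlling generalization

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, T):
        # Closed-form regularized least squares for the output weights:
        #   beta = (H^T H + lam * I)^{-1} H^T T
        H = self._hidden(X)
        A = H.T @ H + self.lam * np.eye(H.shape[1])
        self.beta = np.linalg.solve(A, H.T @ T)

    def predict(self, X):
        return self._hidden(X) @ self.beta


def fitted_q_update(elm, states, actions, rewards, next_states, gamma=0.95):
    """One fitted Q-iteration sweep over a batch of transitions.

    The ELM outputs one Q-value per discrete action; targets follow the
    standard Q-learning rule  y = r + gamma * max_a' Q(s', a').
    """
    targets = elm.predict(states)
    q_next = elm.predict(next_states)
    targets[np.arange(len(actions)), actions] = (
        rewards + gamma * q_next.max(axis=1)
    )
    elm.fit(states, targets)


# Toy usage on random transitions (2-D state, 3 actions), purely to show
# the shapes and the update cycle.
rng = np.random.default_rng(1)
S = rng.normal(size=(200, 2))
A = rng.integers(0, 3, size=200)
R = rng.normal(size=200)
S2 = rng.normal(size=(200, 2))

elm = RegularizedELM(n_inputs=2, n_hidden=50, n_outputs=3)
for _ in range(20):
    fitted_q_update(elm, S, A, R, S2)
print(elm.predict(S[:1]))  # Q-values for the three actions in one state
```

The appeal of this pairing is that each Q-learning sweep reduces to a single closed-form ridge regression for the output weights, so no gradient-based training is needed, while the regularization term keeps the learned Q-function from overfitting the visited states.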
