Neural Networks with Online Sequential Learning Ability for a Reinforcement Learning Algorithm

Reinforcement learning (RL) algorithms that employ neural networks as function approximators have proven to be powerful tools for solving optimal control problems. However, neural network function approximators suffer from several problems: learning becomes difficult when the training data arrive sequentially, structural parameters are hard to determine, and training often ends in local minima or overfitting. In this paper, a novel online sequential learning evolving neural network model for RL is proposed. We explore the use of the minimal resource allocation neural network (mRAN) and develop an mRAN function approximation approach for RL systems. The potential of this approach is demonstrated through a case study. The mean-squared-error accuracy, computational cost, and robustness of the scheme are compared with those of static-structure neural networks.
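To illustrate the kind of sequential learning the paper builds on, the sketch below implements a simplified resource-allocating RBF network in the spirit of Platt's RAN and the mRAN family: a hidden unit is allocated when an input is both novel (far from all existing centers) and poorly predicted; otherwise the output weights are adapted by a gradient step. This is a hypothetical, minimal illustration, not the paper's algorithm: the class name `MinimalRAN`, all thresholds, the fixed kernel width, and the omission of mRAN's pruning and sliding-window criteria are assumptions made for brevity.

```python
import numpy as np

class MinimalRAN:
    """Simplified resource-allocating RBF network (illustrative sketch).

    Allocation rule (RAN-style): add a Gaussian unit when the input is
    novel (distance to the nearest center exceeds dist_thresh) AND the
    prediction error exceeds err_thresh; otherwise adapt output weights
    by stochastic gradient descent. mRAN's pruning step is omitted.
    """

    def __init__(self, dist_thresh=0.15, err_thresh=0.1, width=0.2, lr=0.05):
        self.centers = []          # list of RBF centers (1-D arrays)
        self.weights = []          # one output weight per hidden unit
        self.dist_thresh = dist_thresh
        self.err_thresh = err_thresh
        self.width = width
        self.lr = lr

    def _phi(self, x):
        # Gaussian activations of all hidden units for input x
        return np.array([np.exp(-np.sum((x - c) ** 2) / (2.0 * self.width ** 2))
                         for c in self.centers])

    def predict(self, x):
        if not self.centers:
            return 0.0
        return float(np.dot(self.weights, self._phi(np.atleast_1d(x))))

    def update(self, x, target):
        # One sequential learning step on a single (x, target) pair
        x = np.atleast_1d(np.asarray(x, dtype=float))
        err = target - self.predict(x)
        dist = (min(np.linalg.norm(x - c) for c in self.centers)
                if self.centers else np.inf)
        if abs(err) > self.err_thresh and dist > self.dist_thresh:
            # Novel and poorly predicted: allocate a new hidden unit
            self.centers.append(x.copy())
            self.weights.append(err)
        elif self.centers:
            # Familiar input: LMS-style gradient step on output weights
            phi = self._phi(x)
            self.weights = list(np.asarray(self.weights) + self.lr * err * phi)

# Sequential presentation of a toy target function, one sample at a time
net = MinimalRAN()
for _ in range(200):
    for x in np.linspace(0.0, 1.0, 20):
        net.update(x, np.sin(2.0 * np.pi * x))
```

In an RL setting, `target` would be a bootstrapped value (e.g. a temporal-difference target) rather than a fixed function value; the network then grows its structure only where the state space is actually visited, which is the appeal of sequential allocation over a fixed-structure approximator.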
