Q-learning control based on self-organizing RBF network

A radial basis function (RBF) neural network is used to approximate the Q-value function, so that the learning agent can generalize what it has learned across continuous state and action spaces. The input of the RBF network is a state-action pair, and the output is the Q-value of that pair. The state is determined by the transfer characteristic of the system. The action fed to the network consists of a greedy action, obtained by optimizing the Q-value, and a noise action that follows a normal distribution. The RNA algorithm and the gradient descent algorithm are used to adjust the structure and the parameters of the network in a self-organizing way. Simulation results on the balancing control of a cart-pole system show the effectiveness of the proposed Q-learning method.
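
As a rough illustration of the scheme summarized above, the following Python sketch shows a Gaussian RBF network that maps a state-action pair to a Q-value, selects an action as greedy action plus normally distributed noise, and adjusts its output weights by gradient descent on the temporal-difference error. It is a minimal sketch under stated assumptions, not the method of the paper: the class name `RBFQ`, the fixed centers and shared width, the candidate-grid search for the greedy action, and the values of `lr`, `gamma`, and `sigma` are all hypothetical, and the self-organizing adjustment of the network structure is omitted.

```python
import math
import random


class RBFQ:
    """Q(s, a) as a weighted sum of Gaussian RBFs over (state, action) inputs.

    Assumption: centers and width are fixed here; only the output weights
    are trained. Structure adaptation is not modeled in this sketch.
    """

    def __init__(self, centers, width, lr=0.1):
        self.centers = [list(c) for c in centers]  # points in (state, action) space
        self.width = width                         # shared Gaussian width (assumed)
        self.weights = [0.0] * len(self.centers)   # output-layer weights
        self.lr = lr                               # gradient step size (assumed)

    def features(self, state, action):
        """Activation of each hidden unit for the input (state, action)."""
        x = list(state) + [action]
        return [
            math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c))
                     / (2.0 * self.width ** 2))
            for c in self.centers
        ]

    def q(self, state, action):
        """Network output: the estimated Q-value of the pair."""
        phi = self.features(state, action)
        return sum(w * p for w, p in zip(self.weights, phi))

    def greedy_action(self, state, candidates):
        """Approximate argmax of Q over a finite grid of candidate actions."""
        return max(candidates, key=lambda a: self.q(state, a))

    def act(self, state, candidates, sigma, rng):
        """Greedy action plus zero-mean Gaussian exploration noise."""
        return self.greedy_action(state, candidates) + rng.gauss(0.0, sigma)

    def update(self, s, a, r, s_next, candidates, gamma=0.95):
        """One gradient-descent step on the squared TD error (weights only)."""
        target = r + gamma * max(self.q(s_next, b) for b in candidates)
        td_error = target - self.q(s, a)
        for i, p in enumerate(self.features(s, a)):
            self.weights[i] += self.lr * td_error * p
        return td_error


if __name__ == "__main__":
    rng = random.Random(0)
    actions = [-1.0, 1.0]
    net = RBFQ(centers=[(0.0, -1.0), (0.0, 1.0)], width=1.0)
    a = net.act((0.0,), actions, sigma=0.1, rng=rng)
    print("TD error:", net.update((0.0,), a, 1.0, (0.0,), actions))
```

In a cart-pole setting the state would hold the cart and pole variables and the candidate grid would span the admissible force range; those details are left out because the abstract does not specify them.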