Initialization in reinforcement learning for mobile robot path planning

To improve the convergence rate of the standard Q-learning algorithm, we propose an initialization method for the reinforcement learning of a mobile robot, based on the artificial potential field (APF), a virtual field over the robot's workspace. The potential energy of each point in the field is specified from prior knowledge and represents the maximum cumulative reward obtainable by following the optimal path policy. In the APF, points occupied by obstacles have zero potential energy, while the goal point has the global maximum potential energy in the workspace. The initial Q value is defined as the immediate reward at the current point plus the maximum cumulative reward at the succeeding point under the optimal path policy. With this initialization, the improved algorithm converges more rapidly and steadily than the original algorithm. The proposed method is validated on robot path planning in a grid workspace. Experimental results show that the improved algorithm raises learning efficiency in the early stage of learning and improves overall performance.
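The initialization described above can be sketched in code. This is a minimal illustration, not the paper's implementation: the grid layout, reward values, and discount factor are assumed for the example. The potential energy V of each free cell is computed as the maximum discounted cumulative reward toward the goal, and each initial Q value is then set to the immediate reward plus the potential energy of the successor cell.

```python
import numpy as np

GAMMA = 0.9
R_GOAL, R_OBSTACLE, R_STEP = 1.0, -1.0, 0.0   # assumed immediate rewards

grid = np.array([
    [0, 0, 0, 0],
    [0, 1, 1, 0],      # 1 marks an obstacle cell
    [0, 0, 0, 0],
])
goal = (2, 3)

# Potential energy V(s): maximum discounted cumulative reward from s,
# obtained by value iteration; obstacle cells keep zero potential energy.
rows, cols = grid.shape
V = np.zeros((rows, cols))
V[goal] = R_GOAL
for _ in range(rows * cols):                  # enough sweeps to converge
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] == 1 or (r, c) == goal:
                continue
            best = 0.0
            for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] == 0:
                    best = max(best, GAMMA * V[nr, nc])
            V[r, c] = best

# Initial Q value: immediate reward plus the successor's potential energy.
actions = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
Q0 = np.zeros((rows, cols, len(actions)))
for r in range(rows):
    for c in range(cols):
        for a, (dr, dc) in enumerate(actions):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr, nc] == 1:
                Q0[r, c, a] = R_OBSTACLE      # moving into a wall or obstacle
            elif (nr, nc) == goal:
                Q0[r, c, a] = R_GOAL
            else:
                Q0[r, c, a] = R_STEP + V[nr, nc]
```

Starting Q-learning from `Q0` instead of an all-zero table gives the agent a gradient toward the goal from the first episode, which is the source of the faster early-stage convergence claimed above.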