Robot path planning in a complex environment based on delayed-optimization reinforcement learning

This paper proposes delayed-optimization reinforcement learning (DORL) and applies it to mobile robot control in a complex environment with multiple obstacles. DORL incorporates delayed optimization of sub-optimal solutions into the reinforcement-learning agent, which enables the agent to learn from globally optimized control experience. In the experiments, DORL learns the globally optimal control strategy and shows much better learning performance than the traditional reinforcement learning method.
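
As a rough illustration of the idea of learning from optimized experience, the sketch below runs tabular Q-learning on a small grid with obstacles. After each successful episode it removes loops from the recorded trajectory and replays the shortened trajectory backwards. This is an assumption made for illustration only: the abstract does not specify how the delayed optimization is carried out, and loop erasure with reverse replay is a simple stand-in, not the DORL algorithm itself. The grid layout, reward values, hyperparameters, and all function names (`step`, `erase_loops`, `q_update`, `train`, `greedy_path`) are hypothetical.

```python
import random

# Hypothetical 5x5 map (assumption): 0 = free cell, 1 = obstacle.
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
START, GOAL = (0, 0), (4, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, a):
    """Deterministic grid move; returns (next_state, reward, done)."""
    nr, nc = state[0] + ACTIONS[a][0], state[1] + ACTIONS[a][1]
    if not (0 <= nr < 5 and 0 <= nc < 5) or GRID[nr][nc] == 1:
        return state, -1.0, False        # wall or obstacle: stay in place
    if (nr, nc) == GOAL:
        return (nr, nc), 10.0, True
    return (nr, nc), -0.1, False


def erase_loops(traj):
    """Drop every cycle from a list of (s, a, r, s2, done) transitions."""
    path, index = [], {}                 # index: state -> position in path
    for tr in traj:
        s = tr[0]
        if s in index:                   # s was visited before: cut the loop
            cut = index[s]
            for old in path[cut:]:
                index.pop(old[0], None)
            del path[cut:]
        index[s] = len(path)
        path.append(tr)
    return path


def q_update(Q, s, a, r, s2, done, alpha=0.5, gamma=0.95):
    """Standard one-step Q-learning update."""
    target = r if done else r + gamma * max(Q[s2])
    Q[s][a] += alpha * (target - Q[s][a])


def greedy(Q, s):
    """Index of the highest-valued action in state s."""
    return Q[s].index(max(Q[s]))


def train(episodes=300, eps=0.2, max_steps=200, seed=0):
    rng = random.Random(seed)
    Q = {(r, c): [0.0] * 4 for r in range(5) for c in range(5)}
    for _ in range(episodes):
        s, traj = START, []
        for _ in range(max_steps):
            a = rng.randrange(4) if rng.random() < eps else greedy(Q, s)
            s2, r, done = step(s, a)
            q_update(Q, s, a, r, s2, done)       # ordinary online update
            traj.append((s, a, r, s2, done))
            s = s2
            if done:
                break
        if traj[-1][4]:                          # goal reached in this episode
            # Delayed step (assumed): shorten the episode, then replay it
            # backwards so the goal reward propagates along the cleaned path.
            for tr in reversed(erase_loops(traj)):
                q_update(Q, *tr)
    return Q


def greedy_path(Q, max_steps=50):
    """Follow the greedy policy from START and return the visited cells."""
    s, path = START, [START]
    for _ in range(max_steps):
        s, _, done = step(s, greedy(Q, s))
        path.append(s)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the sequence of cells the learned greedy policy visits from the start cell. Removing the `if traj[-1][4]:` block gives a plain Q-learning baseline for comparison on the same map.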