Application of Learning Trajectory Preference Algorithm Based on On-Line Iterative Feedback in Robots

In unstructured environments, what counts as a good robot trajectory varies with the user, the task, and the environment, so a trajectory can no longer be specified by simple geometric constraints alone. It is therefore necessary to study the trajectory preference problem for robots in unstructured environments. We propose an active on-line learning framework that guides a robot to gradually learn the user's preferences for how objects should be manipulated in context-rich environments. Unlike conventional learning from demonstration (LfD) methods, our method does not require users to supply optimal trajectories as training data. Instead, the user only provides on-line feedback that slightly improves the current trajectory, and the robot approaches the optimal trajectory iteratively. In this paper, we apply the algorithm to high-degree-of-freedom robots; compared with several common algorithms, it obtains trajectories closer to the ideal when the environment and task change.
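The iterative feedback loop described above can be illustrated with a minimal sketch. It makes several assumptions that do not come from the paper: trajectory quality is modelled as a linear score over a feature vector, the update is a perceptron-style step toward the user's slightly improved trajectory, and the user is simulated by a hidden weight vector. All names (`features`, `propose`, `slight_improvement`, `learn_preferences`) and the toy data are hypothetical.

```python
def features(trajectory):
    # Hypothetical feature map: the mean of each coordinate over all waypoints.
    n = len(trajectory)
    dims = len(trajectory[0])
    return [sum(p[d] for p in trajectory) / n for d in range(dims)]


def score(w, trajectory):
    # Assumed linear score: dot product of weights and trajectory features.
    return sum(wi * fi for wi, fi in zip(w, features(trajectory)))


def propose(w, candidates):
    # The robot proposes the candidate it currently scores highest
    # (the first one wins on ties).
    return max(range(len(candidates)), key=lambda i: score(w, candidates[i]))


def slight_improvement(proposed, candidates, w_user):
    # Simulated user: returns the candidate that is only slightly better than
    # the proposal under the hidden preference w_user, not the optimal one.
    current = score(w_user, candidates[proposed])
    better = [i for i in range(len(candidates))
              if score(w_user, candidates[i]) > current]
    if not better:
        return proposed  # the user has no improvement to offer
    return min(better, key=lambda i: score(w_user, candidates[i]))


def learn_preferences(candidates, w_user, rounds=20):
    # Start with no knowledge of the user's preferences.
    w = [0.0] * len(w_user)
    for _ in range(rounds):
        proposed = propose(w, candidates)
        improved = slight_improvement(proposed, candidates, w_user)
        if improved == proposed:
            break
        # Perceptron-style step: move the weights toward the features of the
        # improved trajectory and away from those of the proposed one.
        f_new = features(candidates[improved])
        f_old = features(candidates[proposed])
        w = [wi + a - b for wi, a, b in zip(w, f_new, f_old)]
    return w


# Toy data: three two-waypoint trajectories in a 2-D feature space.
candidates = [
    [[0.0, 1.0], [0.0, 1.0]],
    [[1.0, 1.0], [1.0, 1.0]],
    [[1.0, 0.0], [1.0, 0.0]],
]
w_user = [1.0, -1.0]  # hidden preference, used only by the simulated user
w_learned = learn_preferences(candidates, w_user)
best_index = propose(w_learned, candidates)
```

In this toy run the simulated user never demonstrates the best trajectory directly. Each round supplies only a marginally better alternative, yet the learned weights end up ranking the user's preferred candidate first.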