Forward-Looking Imaginative Planning Framework Combined with Prioritized-Replay Double DQN

Many machine learning systems are built to solve difficult planning problems, yet they typically apply a one-size-fits-all strategy across tasks: computing resources are wasted on simple planning problems, while complex problems receive too little computation. This calls for a framework that does not learn a single, fixed strategy, but instead introduces a set of decision controllers that solve varied planning tasks by learning to construct, predict, and evaluate plans. We therefore propose a forward-looking imaginative planning framework combined with Prioritized-Replay Double DQN: a model-based sequential decision controller that determines how many iterations of the decision process to run and which model to consult in each iteration. Before committing to any single action, the controller can imagine ahead for a limited number of steps from the current state and evaluate candidate actions with its model-based imagination. All imagined actions and outcomes are iteratively integrated into a "plan environment", in which alternative imagined actions can be tested and a learned policy can be applied flexibly to previously imagined states. On this basis, prioritized experience replay is adopted to improve the sampling weights and training efficiency, so that the agent incurs a lower overall cost than traditional fixed-strategy methods, counting both task loss and computational cost.
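To make the imagination step concrete, the sketch below shows one way a depth-limited look-ahead could work: each candidate root action is rolled forward through a learned one-step dynamics model, the imagined transitions are accumulated into a plan record, and the learned policy is reused inside the imagined states. The interfaces `model(s, a) -> (s_next, r)` and `q_net(s) -> Q-values`, along with the depth and discount parameters, are illustrative assumptions, not the paper's exact components.

```python
import torch

@torch.no_grad()
def imagine_and_select(state, actions, model, q_net, depth=3, gamma=0.99):
    """Score each candidate root action by a depth-limited imagined rollout.

    `model` and `q_net` are assumed interfaces: a learned one-step dynamics
    model and a learned action-value network, respectively.
    """
    best_action, best_return = None, float("-inf")
    plan = []  # accumulated imagined transitions: the "plan environment"
    for root_action in actions:
        s, a, ret, discount = state, root_action, 0.0, 1.0
        for _ in range(depth):                    # limited-step look-ahead
            s, r = model(s, a)                    # imagine one step forward
            plan.append((a, r, s))
            ret += discount * r
            discount *= gamma
            a = q_net(s).argmax().item()          # reuse the learned policy
                                                  # in the imagined state
        ret += discount * q_net(s).max().item()   # bootstrap beyond the horizon
        if ret > best_return:
            best_action, best_return = root_action, ret
    return best_action, plan
```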
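The replay and learning components named in the title follow two standard techniques: proportional prioritized experience replay (Schaul et al.), where transitions are sampled with probability proportional to their TD error and the resulting bias is corrected with importance-sampling weights, and the Double DQN target (van Hasselt et al.), where the online network selects the next action and the target network evaluates it. Below is a minimal PyTorch/NumPy sketch of both; the class and function names, network interfaces, and hyperparameters are assumptions for illustration, not the paper's implementation.

```python
import numpy as np
import torch

class PrioritizedReplay:
    """Proportional prioritized replay: P(i) ~ p_i ** alpha."""
    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha, self.pos = capacity, alpha, 0
        self.buffer = []
        self.priorities = np.zeros(capacity, dtype=np.float64)

    def add(self, transition):
        max_p = self.priorities.max() if self.buffer else 1.0
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
        else:
            self.buffer[self.pos] = transition
        self.priorities[self.pos] = max_p          # new samples get max priority
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        p = self.priorities[:len(self.buffer)] ** self.alpha
        probs = p / p.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights correct the non-uniform sampling bias.
        weights = (len(self.buffer) * probs[idx]) ** (-beta)
        weights /= weights.max()
        batch = [self.buffer[i] for i in idx]
        return idx, batch, torch.as_tensor(weights, dtype=torch.float32)

    def update(self, idx, td_errors, eps=1e-6):
        self.priorities[idx] = np.abs(td_errors) + eps

def double_dqn_loss(online, target, batch, weights, gamma=0.99):
    s, a, r, s2, done = batch
    # Double DQN target: y = r + gamma * Q_target(s', argmax_a Q_online(s', a)).
    next_a = online(s2).argmax(dim=1, keepdim=True)
    y = r + gamma * (1 - done) * target(s2).gather(1, next_a).squeeze(1)
    q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
    td = y.detach() - q
    loss = (weights * td.pow(2)).mean()            # importance-weighted MSE
    return loss, td.detach().abs().cpu().numpy()   # |TD errors| update priorities
```

In a training loop, the absolute TD errors returned by `double_dqn_loss` would be passed back to `PrioritizedReplay.update` so that poorly predicted transitions are revisited more often, which is the mechanism by which prioritization improves sampling weight and training efficiency.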