Parallel Exploration via Negatively Correlated Search
Effective exploration is key to successful search. The recently proposed Negatively Correlated Search (NCS) pursues this through parallel exploration, where a set of search processes are driven to be negatively correlated so that different promising areas of the search space can be visited simultaneously. Various applications have verified the advantages of this novel search behavior. Nevertheless, a mathematical understanding is still lacking, as the previous NCS was devised mostly by intuition. In this paper, a more principled NCS is presented, showing that parallel exploration is equivalent to the explicit maximization of both the population diversity and the population solution quality, and can be optimally achieved by partially descending the gradients of both models with respect to each search process. For empirical assessment, reinforcement learning tasks, which strongly demand exploration ability, are considered. The new NCS is applied to popular reinforcement learning problems, i.e., playing Atari games, to directly train a deep convolutional network with 1.7 million connection weights in environments with uncertain and delayed rewards. Empirical results show that the significant advantages of NCS over the compared state-of-the-art methods can be largely attributed to its effective parallel exploration ability.
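The abstract gives no algorithmic details beyond the idea of combining a solution-quality objective with a population-diversity objective and updating each search process along the gradients of both. The following is only a minimal illustrative sketch of that idea on a toy objective, not the paper's actual method: each search process is modeled as a Gaussian with a fixed step size, the fitness gradient is estimated with a score-function estimator, the diversity term is the mean squared distance to the other processes' means, and the trade-off weight `lam`, the learning rate, and all other parameters are assumptions made purely for illustration.

```python
import numpy as np


def sphere(x):
    # Toy objective to minimize (lower is better).
    return np.sum(x ** 2)


def ncs_gradient_sketch(dim=5, n_procs=4, pop=20, sigma=0.3,
                        lr=0.05, lam=0.1, iters=300, seed=0):
    """Illustrative sketch of negatively correlated parallel search.

    Each process i keeps a Gaussian search distribution with mean means[i].
    Its mean is moved along an estimate of the gradient of expected fitness
    plus the gradient of a diversity term (mean squared distance to the
    other processes' means), so processes are pushed toward good solutions
    while being pushed apart from one another.
    """
    rng = np.random.default_rng(seed)
    means = rng.normal(0.0, 1.0, size=(n_procs, dim))
    for _ in range(iters):
        new_means = means.copy()
        for i in range(n_procs):
            # Sample candidates around the current mean.
            eps = rng.normal(size=(pop, dim))
            cand = means[i] + sigma * eps
            # Quality term: negated objective (to be maximized), normalized.
            fit = -np.array([sphere(c) for c in cand])
            fit = (fit - fit.mean()) / (fit.std() + 1e-8)
            # Score-function estimate of the gradient of expected fitness.
            grad_fit = (fit[:, None] * eps).mean(axis=0) / sigma
            # Diversity term: gradient of mean squared distance to others.
            others = np.delete(means, i, axis=0)
            grad_div = 2.0 * (means[i] - others).mean(axis=0)
            new_means[i] = means[i] + lr * (grad_fit + lam * grad_div)
        means = new_means
    return means


if __name__ == "__main__":
    finals = ncs_gradient_sketch()
    print([round(sphere(m), 3) for m in finals])
```

In this sketch, `lam` controls the balance between exploiting the quality gradient and exploring via the diversity gradient; setting it to zero reduces each process to an independent hill-climbing search, while larger values spread the processes over different regions of the search space.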