Policy Transfer in Reinforcement Learning: A Selective Exploration Approach