Deep reinforcement learning in finite-horizon to explore the most probable transition pathway