Reinforcement learning and A* search for the unit commitment problem