A Reinforcement Learning Approach for Electric Vehicle Routing Problem with Vehicle-to-Grid Supply