A Multi-Agent Reinforcement Learning Method With Route Recorders for Vehicle Routing in Supply Chain Management