Multi-agent policy learning-based path planning for autonomous mobile robots