Multi-agent deep reinforcement learning for end-edge orchestrated resource allocation in industrial wireless networks