Distributed output formation tracking control of heterogeneous multi-agent systems using reinforcement learning.