Multi-Agent Trust Region Policy Optimization