Collaborative Transformers for Grounded Situation Recognition