Contrastive Learning for Weakly Supervised Phrase Grounding