Scene graph semantic inference for image and text matching