Multimodal Graph Learning for Generative Tasks