A Framework for Multi-modal Learning: Jointly Modeling Inter- & Intra-Modality Dependencies