Self-supervised 3D Patient Modeling with Multi-modal Attentive Fusion