Audio-visual Interaction in Model Adaptation for Multi-modal Speech Recognition