Video-Based Cross-Modal Auxiliary Network for Multimodal Sentiment Analysis