VideoCLIP: A Cross-Attention Model for Fast Video-Text Retrieval Task with Image CLIP