Cross Lingual Video and Text Retrieval: A New Benchmark Dataset and Algorithm