Evaluation of Sentence Embedding Models for Natural Language Understanding Problems in Russian

We investigate the performance of sentence embeddings models on several tasks for the Russian language. In our comparison, we include such tasks as multiple choice question answering, next sentence prediction, and paraphrase identification. We employ FastText embeddings as a baseline and compare it to ELMo and BERT embeddings. We conduct two series of experiments, using both unsupervised (i.e., based on similarity measure only) and supervised approaches for the tasks. Finally, we present datasets for multiple choice question answering and next sentence prediction in Russian.

[1]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[2]  Andrey Kutuzov,et al.  WebVectors: A Toolkit for Building Web Interfaces for Vector Semantic Models , 2016, AIST.

[3]  Douwe Kiela,et al.  SentEval: An Evaluation Toolkit for Universal Sentence Representations , 2018, LREC.

[4]  Iryna Gurevych,et al.  Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations , 2018, 1803.01400.

[5]  Tomas Mikolov,et al.  Bag of Tricks for Efficient Text Classification , 2016, EACL.

[6]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[7]  Grigori Sidorov,et al.  Unsupervised Sentence Representations as Word Information Series: Revisiting TF-IDF , 2017, Comput. Speech Lang..

[8]  Quoc V. Le,et al.  Distributed Representations of Sentences and Documents , 2014, ICML.

[9]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[10]  Holger Schwenk,et al.  Supervised Learning of Universal Sentence Representations from Natural Language Inference Data , 2017, EMNLP.

[11]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[12]  Anna Veronika Dorogush,et al.  CatBoost: unbiased boosting with categorical features , 2017, NeurIPS.

[13]  Eunsol Choi,et al.  TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension , 2017, ACL.

[14]  Amir Bakarov,et al.  A Survey of Word Embeddings Evaluation Methods , 2018, ArXiv.

[15]  Nan Hua,et al.  Universal Sentence Encoder , 2018, ArXiv.