xMoCo: Cross Momentum Contrastive Learning for Open-Domain Question Answering

Dense passage retrieval has been shown to be an effective approach for information retrieval tasks such as open-domain question answering. Under this paradigm, a dual-encoder model is trained to encode questions and passages separately into vector representations, and all passage vectors are pre-computed and indexed so that relevant passages can be retrieved efficiently via vector-space search at inference time. In this paper, we propose a new contrastive learning method, cross momentum contrastive learning (xMoCo), for training a dual-encoder model for question-passage matching. Our method efficiently maintains a large pool of negative samples, as in the original MoCo, and, by jointly optimizing question-to-passage and passage-to-question matching, enables the use of separate encoders for questions and passages. We evaluate our method on several open-domain QA datasets, and the experimental results demonstrate the effectiveness of the proposed approach.
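To make the training scheme concrete, below is a minimal PyTorch sketch of one xMoCo-style update step, assuming the cross arrangement described above: each side keeps a fast (gradient-updated) encoder and a slow (momentum-updated) encoder, fast question embeddings are matched against slow passage embeddings and vice versa, and a queue of past slow embeddings serves as negatives. The toy MLP encoders stand in for the paper's pretrained transformer encoders, and the dimensions, queue size, momentum, and temperature are illustrative values, not the paper's settings.

```python
# Hypothetical sketch of a cross momentum contrastive (xMoCo) training step.
# Toy MLPs replace the BERT-based encoders; all hyperparameters are made up.
import torch
import torch.nn.functional as F
from torch import nn

DIM, QUEUE_SIZE, MOMENTUM, TEMP = 128, 4096, 0.999, 0.05  # illustrative

def make_encoder():
    # Stand-in encoder; a real system would use a pretrained transformer.
    return nn.Sequential(nn.Linear(768, 512), nn.ReLU(), nn.Linear(512, DIM))

# Fast (gradient-updated) and slow (momentum-updated) encoders per side.
q_fast, p_fast = make_encoder(), make_encoder()
q_slow, p_slow = make_encoder(), make_encoder()
for slow, fast in ((q_slow, q_fast), (p_slow, p_fast)):
    slow.load_state_dict(fast.state_dict())
    for param in slow.parameters():
        param.requires_grad = False

# One negative queue per side, holding slow-encoder outputs of past batches.
q_queue = F.normalize(torch.randn(QUEUE_SIZE, DIM), dim=1)
p_queue = F.normalize(torch.randn(QUEUE_SIZE, DIM), dim=1)

def info_nce(anchors, positives, queue):
    # Positive logit: similarity of each aligned pair; negatives from queue.
    pos = (anchors * positives).sum(dim=1, keepdim=True)
    neg = anchors @ queue.t()
    logits = torch.cat([pos, neg], dim=1) / TEMP
    labels = torch.zeros(anchors.size(0), dtype=torch.long)  # pos is class 0
    return F.cross_entropy(logits, labels)

def train_step(questions, passages, optimizer):
    """One xMoCo step on a batch of aligned (question, passage) features."""
    global q_queue, p_queue
    q = F.normalize(q_fast(questions), dim=1)  # fast question embeddings
    p = F.normalize(p_fast(passages), dim=1)   # fast passage embeddings
    with torch.no_grad():                      # slow encoders give targets
        q_tgt = F.normalize(q_slow(questions), dim=1)
        p_tgt = F.normalize(p_slow(passages), dim=1)

    # Cross losses: fast questions match slow passages, and vice versa, so
    # both fast encoders receive gradients even though they are separate.
    loss = info_nce(q, p_tgt, p_queue) + info_nce(p, q_tgt, q_queue)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    with torch.no_grad():
        # Momentum-update the slow encoders toward the fast ones.
        for slow, fast in ((q_slow, q_fast), (p_slow, p_fast)):
            for ps, pf in zip(slow.parameters(), fast.parameters()):
                ps.mul_(MOMENTUM).add_(pf, alpha=1.0 - MOMENTUM)
        # Enqueue the newest slow embeddings; dequeue the oldest.
        q_queue = torch.cat([q_tgt, q_queue])[:QUEUE_SIZE]
        p_queue = torch.cat([p_tgt, p_queue])[:QUEUE_SIZE]
    return loss.item()
```

A training loop would call train_step with batched question and passage features and an optimizer over both fast encoders; at inference time, only the fast passage encoder is needed to pre-compute and index passage vectors, and the fast question encoder embeds incoming questions for vector-space search.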
