Training Curricula for Open Domain Answer Re-Ranking

In precision-oriented tasks like answer ranking, it is more important to rank many relevant answers highly than to retrieve all relevant answers. It follows that a good ranking strategy is to learn to identify the easiest correct answers first (i.e., assign a high ranking score to answers whose characteristics usually indicate relevance, and a low ranking score to those whose characteristics do not) before incorporating more complex logic to handle difficult cases (e.g., semantic matching or reasoning). In this work, we apply this idea to the training of neural answer rankers using curriculum learning. We propose several heuristics for estimating the difficulty of a given training sample, and we show that these heuristics can be used to build a training curriculum that down-weights difficult samples early in the training process. As training progresses, our approach gradually shifts toward weighting all samples equally, regardless of difficulty. We present a comprehensive evaluation of this approach on three answer ranking datasets. The results show that our curricula improve the performance of two leading neural ranking architectures, BERT and ConvKNRM, under both pointwise and pairwise losses. When applied to a BERT-based ranker, our method yields up to a 4% improvement in MRR and a 9% improvement in P@1 (compared to the same model trained without a curriculum), producing models whose performance is comparable to that of more expensive state-of-the-art techniques.
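
To make the weighting scheme concrete, here is a minimal PyTorch sketch of one plausible reading of the curriculum described above: per-sample loss weights start at (1 - difficulty) and interpolate linearly toward uniform weighting as training progresses. The abstract does not specify the exact schedule or difficulty heuristic, so the function name, the `warmup_steps` parameter, and the linear interpolation are illustrative assumptions, not the paper's formulation.

```python
import torch

def curriculum_weights(difficulty: torch.Tensor,
                       step: int,
                       warmup_steps: int) -> torch.Tensor:
    """Per-sample loss weights for a difficulty-based curriculum (sketch).

    difficulty: heuristic difficulty estimates in [0, 1], one per sample
                (assumed to come from a heuristic such as an unsupervised
                ranker's score gap; the actual heuristics are the paper's).
    Early in training, hard samples (difficulty near 1) are down-weighted;
    after `warmup_steps`, all samples receive equal weight.
    """
    progress = min(step / max(warmup_steps, 1), 1.0)
    # Linearly interpolate from (1 - difficulty) toward a uniform weight of 1.
    return (1.0 - difficulty) * (1.0 - progress) + progress


# Usage: scale a pointwise or pairwise ranking loss by the curriculum weights.
losses = torch.tensor([0.9, 0.4, 1.2])        # per-sample ranking losses
difficulty = torch.tensor([0.8, 0.1, 0.5])    # heuristic difficulty estimates
weighted = curriculum_weights(difficulty, step=100, warmup_steps=1000) * losses
loss = weighted.mean()
```

Note that under this schedule the curriculum only reweights the loss rather than filtering samples, so the training data distribution is unchanged once `progress` reaches 1.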
