Utilizing Bidirectional Encoder Representations from Transformers for Answer Selection

Pre-training a transformer-based model on the language modeling task over a large corpus and then fine-tuning it for downstream tasks has proven very effective in recent years. One major advantage of such pre-trained language models is that they can effectively capture the context of each word in a sentence. However, pre-trained language models have not yet been extensively applied to tasks such as answer selection. To investigate their effectiveness in such tasks, in this paper we adopt the pre-trained Bidirectional Encoder Representations from Transformers (BERT) language model and fine-tune it on two Question Answering (QA) datasets and three Community Question Answering (CQA) datasets for the answer selection task. We find that fine-tuning BERT for answer selection is very effective, observing maximum improvements of 13.1% on the QA datasets and 18.7% on the CQA datasets over the previous state-of-the-art.
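
As a rough illustration of the fine-tuning setup described above, the sketch below scores a question-candidate pair with a BERT sequence-pair classifier using the HuggingFace Transformers library. The checkpoint name, binary label semantics, and example text are assumptions made for illustration and are not the paper's exact configuration or hyperparameters.

import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Assumed checkpoint and binary relevance labels (0 = irrelevant, 1 = relevant);
# the paper's actual training setup is not reproduced here.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

question = "What is the capital of France?"  # hypothetical example pair
candidate = "Paris is the capital and largest city of France."

# BERT consumes the pair as one sequence: [CLS] question [SEP] candidate [SEP]
inputs = tokenizer(question, candidate, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Probability that the candidate answers the question; during fine-tuning this
# classification head would be trained with cross-entropy on labeled QA pairs,
# and candidates would be ranked by this score at inference time.
relevance = torch.softmax(logits, dim=-1)[0, 1].item()
print(f"relevance score: {relevance:.3f}")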
