Improving the Community Question Retrieval Performance Using Attention-Based Siamese LSTM

In this paper, we address question retrieval in community Question Answering (cQA): given a new query, the goal is to retrieve from the community archives the previously asked questions that are semantically equivalent to it. The main challenges are the shortness of the questions and the word mismatch problem, since users can formulate the same query with different wording. Although numerous attempts have been made to address this problem, most existing methods rely on supervised models that depend heavily on large training data sets and manual feature engineering. Such methods are also limited in that they disregard word order and ignore syntactic and semantic relationships. In this work, we rely on Neural Networks (NNs), which can learn rich dense representations of text and thereby support predicting the textual similarity between community questions. We propose a deep learning approach based on a Siamese architecture with LSTM networks, augmented with an attention mechanism, and we test different similarity measures for predicting the semantic similarity between community questions. Experiments on real cQA data sets in English and Arabic show that the proposed approach improves question retrieval performance compared with other competitive methods.
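
To make the architecture described above more concrete, the following is a minimal NumPy sketch of a Siamese encoder: one LSTM with shared weights encodes both questions, an attention layer pools the hidden states into a single vector, and a similarity function compares the two vectors. It is an illustration under stated assumptions, not the authors' implementation. The layer sizes, the random untrained weights, the particular attention scoring form, and the three similarity functions (`manhattan_sim`, `cosine_sim`, `euclidean_sim`) are all hypothetical choices; the abstract does not specify which similarity measures or attention variant were used.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB, HID = 8, 6  # illustrative sizes (assumption, not taken from the paper)

# A single parameter set shared by both branches: this sharing is what
# makes the architecture "Siamese". Weights are random and untrained.
params = {
    "W": rng.normal(0, 0.1, (4 * HID, EMB)),  # input weights for the 4 LSTM gates
    "U": rng.normal(0, 0.1, (4 * HID, HID)),  # recurrent weights
    "b": np.zeros(4 * HID),                   # gate biases
    "v": rng.normal(0, 0.1, HID),             # attention scoring vector (assumed form)
}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_states(x, p):
    """Run an LSTM over x of shape (T, EMB); return hidden states of shape (T, HID)."""
    h, c = np.zeros(HID), np.zeros(HID)
    states = []
    for x_t in x:
        z = p["W"] @ x_t + p["U"] @ h + p["b"]
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        states.append(h)
    return np.stack(states)

def attention_pool(states, p):
    """Softmax-weighted sum of hidden states; returns (vector, weights)."""
    scores = np.tanh(states) @ p["v"]
    weights = np.exp(scores - scores.max())  # subtract max for numerical stability
    weights /= weights.sum()
    return weights @ states, weights

def encode(x, p):
    """Encode one question (a sequence of word embeddings) into a vector."""
    vec, _ = attention_pool(lstm_states(x, p), p)
    return vec

# Candidate similarity measures (assumed examples).
def manhattan_sim(a, b):
    return float(np.exp(-np.abs(a - b).sum()))  # in (0, 1], equals 1 when a == b

def cosine_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def euclidean_sim(a, b):
    return float(1.0 / (1.0 + np.linalg.norm(a - b)))  # in (0, 1]

# Stand-ins for the word embeddings of a new query and an archived question.
q_new = rng.normal(size=(5, EMB))
q_old = rng.normal(size=(7, EMB))
e_new, e_old = encode(q_new, params), encode(q_old, params)
print(manhattan_sim(e_new, e_old), cosine_sim(e_new, e_old), euclidean_sim(e_new, e_old))
```

In a retrieval setting, the new query would be encoded once and scored against every archived question, with the archive then ranked by the chosen similarity. Training the shared weights on labeled question pairs is omitted from this sketch.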
