Siamese LSTM with Convolutional Similarity for Similar Question Retrieval

In this paper, we model the similar question retrieval task as a binary classification problem. We propose a novel approach, “1D-Siamese LSTM for cQA (1D-SLcQA)”, to find the semantic similarity between a new question and existing question(s). In 1D-SLcQA, we combine twin LSTM networks with a contrastive loss function to capture long-term dependencies, i.e., to capture semantic similarity even when the answers/questions are very long (200 words). The similarity of the questions is modeled using a single network with one-dimensional (1D) feature convolution between the feature vectors learned from the twin LSTM layers. Experiments on a large-scale, real-world Yahoo Answers dataset show that 1D-SLcQA outperforms the state-of-the-art Siamese cQA approach (SCQA).
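To make the role of the contrastive loss concrete, the following is a minimal sketch in plain Python. It assumes the commonly used margin-based form of the loss on the Euclidean distance between two feature vectors; the abstract does not state the exact formulation used in 1D-SLcQA, so the function names, the 0.5 scaling, and the default margin of 1.0 are illustrative assumptions rather than the authors' implementation.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, is_similar, margin=1.0):
    """Margin-based contrastive loss for one pair of feature vectors.

    Assumed form (not confirmed by the abstract):
      similar pair:    0.5 * d^2
      dissimilar pair: 0.5 * max(0, margin - d)^2
    where d is the Euclidean distance between u and v.
    """
    d = euclidean_distance(u, v)
    if is_similar:
        # Similar questions are pulled together: any distance is penalized.
        return 0.5 * d ** 2
    # Dissimilar questions are pushed apart only until they clear the margin.
    return 0.5 * max(0.0, margin - d) ** 2
```

Under this form, a pair labeled similar is penalized for any distance between its two vectors, while a pair labeled dissimilar contributes no loss once the vectors are at least `margin` apart. In a Siamese setup, `u` and `v` would be the outputs of the two weight-sharing LSTM branches for the two questions.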
