Paper information: A Semantic Similarity Computing Model based on Siamese Network for Duplicate Questions Identification

Traditional methods for computing semantic similarity mostly treat a text as a set of words: they count the occurrences of each word to build a feature vector, then apply a metric such as cosine distance between the vectors to score text similarity. However, these methods consider only the word level of a sentence, not the semantic level, and so may ignore much important information, including syntax and word order. This paper proposes a deep learning method that combines an attention mechanism with a BiLSTM in a Siamese network to perform semantic similarity matching on given question pairs. Experimental results show that the models can make full use of the semantic information in the text: the F1 score on the dataset provided by the CCKS2018 question-intention matching task is 0.84586, which placed fourth in the final test.
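
The word-counting baseline described in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the whitespace tokenizer, the function names `bag_of_words` and `cosine_similarity`, and the two example questions are assumptions made for illustration. The sketch computes cosine similarity (cosine distance is one minus this value) and shows the weakness the abstract points out: two questions with the same words in a different order get essentially the same vector.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase, whitespace-separated tokens (a deliberately naive tokenizer)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    # Counter returns 0 for missing keys, so only words shared by both contribute.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


# Hypothetical question pair: same words, opposite meaning.
q1 = "does the dog bite the man"
q2 = "does the man bite the dog"
sim = cosine_similarity(bag_of_words(q1), bag_of_words(q2))
# sim equals 1.0 up to floating-point rounding, although the questions differ.
```

Because word counts discard order, a model that reads the sentence as a sequence, such as the BiLSTM encoder the paper proposes, is one way to recover the information this baseline loses.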

Zhengqiu He | Wenliang Chen | Ziyi Tang | Baohui Wang | Zongkui Zhu
