Paper information - A Hybrid Deep Learning Architecture for Paraphrase Identification

A Hybrid Deep Learning Architecture for Paraphrase Identification

Paraphrase Identification (PI), the binary classification of sentence pairs as paraphrases or non-paraphrases, is a vital task in Natural Language Processing. This study proposes a hybrid Deep Learning architecture, combined with an optimized word-embedding approach, that aims to capture as many features as possible from the input natural-language sentences in order to classify accurately whether the two sentences in a pair are paraphrases of each other. The importance of pairing an optimized word-embedding approach with the proposed hybrid architecture is explained. The study also addresses the lack of training data required to build a robust Deep Learning model. The intention is to harness the memorizing power of the Long Short-Term Memory (LSTM) neural network and the feature-extracting capability of the Convolutional Neural Network (CNN), in combination with the optimized word-embedding approach, which aims to capture wide sentential contexts and word order. The proposed model is compared with existing systems and surpasses all of them in accuracy.
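The abstract gives no layer sizes or wiring details, so the following is only a minimal NumPy sketch of the general idea it describes: each sentence is embedded, passed through a convolutional branch and an LSTM branch, and the combined pair features feed a binary classifier. All sizes, the random untrained weights, the lazily built `vocab` table, and the helper names (`embed`, `cnn_features`, `lstm_features`, `encode`, `paraphrase_probability`) are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB, FILTERS, HID, WIDTH = 8, 6, 5, 3  # toy sizes, all hypothetical

# Random, untrained parameters; a real model would learn these from data.
params = {
    "conv": rng.normal(scale=0.3, size=(WIDTH * EMB, FILTERS)),
    "lstm_w": rng.normal(scale=0.3, size=(4 * HID, EMB + HID)),
    "lstm_b": np.zeros(4 * HID),
    "out_w": rng.normal(scale=0.3, size=2 * (FILTERS + HID)),
    "out_b": 0.0,
}
vocab = {}  # token -> vector; stands in for a trained word-embedding table


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def embed(sentence):
    """Map a non-empty sentence to a (num_tokens, EMB) matrix."""
    tokens = sentence.lower().split()
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = rng.normal(size=EMB)
    return np.stack([vocab[tok] for tok in tokens])


def cnn_features(x):
    """1-D convolution over word windows, ReLU, then max-over-time pooling."""
    if len(x) < WIDTH:  # zero-pad short sentences to one full window
        x = np.vstack([x, np.zeros((WIDTH - len(x), EMB))])
    windows = np.stack([x[i:i + WIDTH].ravel()
                        for i in range(len(x) - WIDTH + 1)])
    return np.maximum(windows @ params["conv"], 0.0).max(axis=0)


def lstm_features(x):
    """Run a single-layer LSTM over the words; return the final hidden state."""
    h, c = np.zeros(HID), np.zeros(HID)
    for x_t in x:
        z = params["lstm_w"] @ np.concatenate([x_t, h]) + params["lstm_b"]
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h


def encode(sentence):
    """Concatenate the CNN and LSTM views of one sentence."""
    x = embed(sentence)
    return np.concatenate([cnn_features(x), lstm_features(x)])


def paraphrase_probability(a, b):
    """Score a sentence pair with order-independent pair features."""
    u, v = encode(a), encode(b)
    feats = np.concatenate([np.abs(u - v), u * v])
    return float(sigmoid(feats @ params["out_w"] + params["out_b"]))


print(paraphrase_probability("the cat sat on the mat",
                             "a cat was sitting on the mat"))
```

Because the weights are random, the printed score carries no meaning; the sketch only shows how the two branches and the embedding lookup could fit together. The absolute-difference and element-wise-product pair features are a common choice that makes the score independent of sentence order, and they are a design assumption here rather than something the abstract specifies.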

Anant V. Nimkar | Divesh R. Kubal
