A Hybrid Model for Paraphrase Detection Combines pros of Text Similarity with Deep Learning

Paraphrase detection (PD) is an essential task in natural language processing. Its goal is to determine whether two statements written in natural language have the same meaning. It is important in many fields, such as plagiarism detection, question answering, document clustering, and information retrieval. This paper proposes a hybrid model that combines the text similarity approach with the deep learning approach in order to improve paraphrase detection. Evaluated on the Microsoft Research Paraphrase Corpus (MSRP) dataset, the model achieves an accuracy of about 76.6% and an F-measure of about 83.5%.
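For readers unfamiliar with the two reported measures, the following minimal sketch shows how accuracy and F-measure are conventionally computed for binary paraphrase labels. It assumes that label 1 means "paraphrase" and that the F-measure is the balanced F1 on the paraphrase class; the abstract does not state these details. The function name and the toy labels are hypothetical and are not taken from the paper or from the corpus.

```python
def accuracy_and_f_measure(gold, predicted):
    """Return (accuracy, F1) for binary labels, where 1 = paraphrase.

    Assumption: F-measure is the balanced F1 on the positive class.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives
    tn = sum(1 for g, p in pairs if g == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Toy labels for illustration only (hypothetical, not corpus data).
gold = [1, 1, 1, 0, 0, 1, 0, 1]
predicted = [1, 1, 0, 0, 1, 1, 0, 1]
acc, f1 = accuracy_and_f_measure(gold, predicted)
print(f"accuracy={acc:.3f}, F-measure={f1:.3f}")
```

Because the F-measure ignores true negatives, it can differ noticeably from accuracy when the two classes are not equally frequent, which is one reason both numbers are usually reported together.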
