DeepPurple: Estimating Sentence Semantic Similarity using N-gram Regression Models and Web Snippets

We estimate the semantic similarity between two sentences using regression models with the following features: 1) n-gram hit rates (lexical matches) between sentences, 2) lexical semantic similarity between non-matching words, and 3) sentence length. Lexical semantic similarity is computed from co-occurrence counts on a corpus harvested from the web, using a modified mutual information metric. State-of-the-art results are obtained for semantic similarity computation at the word level; however, fusing this information at the sentence level provides only moderate improvement on Task 6 of SemEval'12. Despite the simple features used, the regression models perform well, especially for shorter sentences, reaching a correlation of 0.62 on the SemEval test set.
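The feature types listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, whitespace tokenization, clipped-count matching, and the choice of unigrams and bigrams are all assumptions made for illustration. The `pmi` function is plain pointwise mutual information, shown only as a stand-in, since the abstract does not specify the modified metric.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_hit_rate(sent_a, sent_b, n):
    """Fraction of n-grams of sent_a that also occur in sent_b.

    Assumption: lowercased whitespace tokens and clipped counts; the
    abstract does not define the hit rate at this level of detail.
    """
    a = Counter(ngrams(sent_a.lower().split(), n))
    b = Counter(ngrams(sent_b.lower().split(), n))
    total = sum(a.values())
    if total == 0:
        return 0.0
    hits = sum(min(count, b[gram]) for gram, count in a.items())
    return hits / total


def pmi(count_xy, count_x, count_y, total):
    """Plain pointwise mutual information from co-occurrence counts.

    Stand-in only: the paper uses a *modified* mutual information metric
    whose form is not given in the abstract.
    """
    if min(count_xy, count_x, count_y) == 0:
        return 0.0
    return math.log2((count_xy * total) / (count_x * count_y))


def sentence_features(sent_a, sent_b, max_n=2):
    """Hypothetical feature vector: hit rates for n = 1..max_n, then lengths."""
    feats = [ngram_hit_rate(sent_a, sent_b, n) for n in range(1, max_n + 1)]
    feats.append(float(len(sent_a.split())))
    feats.append(float(len(sent_b.split())))
    return feats
```

A vector such as `sentence_features("a cat sat", "a cat ran")` could then be passed, together with a word-level similarity score for the non-matching words, to any standard regression model trained on human similarity ratings.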
