TeamZ: Measuring Semantic Textual Similarity for Spanish Using an Overlap-Based Approach

This paper presents an overlap-based approach that uses bag of words and the Spanish WordNet to address the Spanish subtask (STS-Es) of SemEval-2014 Task 10 on semantic textual similarity. Because bag of words is the most commonly used baseline method for assessing similarity, the resulting performance is modest.
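As a rough illustration of what an overlap-based, bag-of-words similarity can look like, the sketch below scores two sentences by the Jaccard overlap of their token sets. It is not the authors' implementation: the abstract does not state which overlap measure, tokenizer, or WordNet lookup the system uses. The function names, the Jaccard choice, and the `TOY_SYNONYMS` dictionary (a hypothetical stand-in for a Spanish WordNet query) are all assumptions made for illustration.

```python
import re

# Hypothetical toy synonym table standing in for a Spanish WordNet lookup.
# A real system would query a lexical resource instead.
TOY_SYNONYMS = {
    "coche": {"auto"},
    "auto": {"coche"},
}

def tokenize(sentence):
    """Lowercase the sentence and return its set of word tokens."""
    return set(re.findall(r"\w+", sentence.lower()))

def expand(tokens, synonyms):
    """Return the tokens plus any synonyms listed for them."""
    expanded = set(tokens)
    for tok in tokens:
        expanded.update(synonyms.get(tok, ()))
    return expanded

def overlap_similarity(s1, s2, synonyms=None):
    """Jaccard overlap of the (optionally synonym-expanded) token sets, in [0, 1]."""
    synonyms = synonyms or {}
    a = expand(tokenize(s1), synonyms)
    b = expand(tokenize(s2), synonyms)
    if not a and not b:
        return 0.0  # two empty sentences: define similarity as 0
    return len(a & b) / len(a | b)

if __name__ == "__main__":
    print(overlap_similarity("El coche rojo", "El auto rojo"))
    print(overlap_similarity("El coche rojo", "El auto rojo", TOY_SYNONYMS))
```

Without the synonym table, "coche" and "auto" count as different words and lower the score. With it, the two sentences expand to the same token set. This is one plausible reading of how a lexical resource can complement plain word overlap.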
