INAOE_UPV-CORE: Extracting Word Associations from Document Corpora to estimate Semantic Textual Similarity

This paper presents three methods to evaluate the Semantic Textual Similarity (STS). The first two methods do not require labeled training data; instead, they automatically extract semantic knowledge in the form of word associations from a given reference corpus. Two kinds of word associations are considered: cooccurrence statistics and the similarity of word contexts. The third method was done in collaboration with groups from the Universities of Paris 13, Matanzas and Alicante. It uses several word similarity measures as features in order to construct an accurate prediction model for the STS.