Paper Information - Sentence-Level Semantic Textual Similarity Using Word-Level Semantics


Estimating semantic textual similarity between sentences is indispensable for many information retrieval tasks. Traditional lexical similarity measures cannot compute similarity beyond a trivial level; moreover, they capture only textual similarity, not semantic similarity. Researchers have proposed methods using a variety of approaches. In this paper, we propose a novel method for semantic textual similarity that leverages word-level semantics to compute sentence-level semantic similarity. We introduce two new semantic similarity measures based on word-embedding models trained on two different corpora. In addition, we introduce another semantic similarity measure based on word sense comparison. The similarity score for a sentence pair is then computed by applying a linear ranking approach to all proposed measures, with their importance estimated by a linear regression model. We conducted experiments using the SemEval Semantic Textual Similarity (STS-2017) test collections. The experimental results demonstrate that our method is effective for measuring semantic textual similarity and outperforms some known related methods.
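
The general idea of combining several per-pair similarity measures with weights estimated by linear regression can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy vectors, the function names, and the two measures used here (cosine similarity of averaged word vectors and a Jaccard word overlap) are assumptions chosen only for illustration, and the paper's actual embedding-based and word-sense measures are not reproduced.

```python
import numpy as np

# Hypothetical 3-d word vectors standing in for a trained embedding model.
TOY_VECTORS = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.8, 0.2, 0.1]),
    "sits": np.array([0.1, 0.9, 0.2]),
    "runs": np.array([0.2, 0.8, 0.3]),
}

def embedding_similarity(s1, s2, vectors):
    """Cosine similarity between the mean word vectors of two sentences."""
    def mean_vec(sentence):
        vecs = [vectors[w] for w in sentence.lower().split() if w in vectors]
        return np.mean(vecs, axis=0) if vecs else None

    v1, v2 = mean_vec(s1), mean_vec(s2)
    if v1 is None or v2 is None:
        return 0.0  # no known words in one of the sentences
    denom = np.linalg.norm(v1) * np.linalg.norm(v2)
    return float(v1 @ v2 / denom) if denom else 0.0

def lexical_overlap(s1, s2):
    """Jaccard overlap of word sets (a stand-in for a second measure)."""
    a, b = set(s1.lower().split()), set(s2.lower().split())
    return len(a & b) / len(a | b) if (a | b) else 0.0

def fit_weights(features, gold):
    """Least-squares weights (plus intercept) mapping measures to gold scores."""
    X = np.column_stack([features, np.ones(len(features))])
    w, *_ = np.linalg.lstsq(X, gold, rcond=None)
    return w

def predict(features, w):
    """Combine the measures linearly using the fitted weights."""
    X = np.column_stack([features, np.ones(len(features))])
    return X @ w
```

In this sketch, each row of `features` holds the measures computed for one sentence pair, and `gold` holds the human-annotated similarity scores for the training pairs; the fitted weights then play the role of the per-measure importance.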

Masaki Aono | Md Shajalal
