Paper information - UMBC_EBIQUITY-CORE: Semantic Textual Similarity Systems - 字舞流文

UMBC_EBIQUITY-CORE: Semantic Textual Similarity Systems

We describe three semantic text similarity systems developed for the *SEM 2013 STS shared task and the results of the corresponding three runs. All of them shared a word similarity feature that combined LSA word similarity and WordNet knowledge. The first, which achieved the best mean score of the 89 submitted runs, used a simple term alignment algorithm augmented with penalty terms. The other two runs, ranked second and fourth, used support vector regression models to combine larger sets of features.

Jonathan Weese | Timothy W. Finin | Lushan Han | James Mayfield | Abhay L. Kashyap

[1] Zellig S. Harris, et al. Mathematical structures of language, 1968, Interscience tracts in pure and applied mathematics.

[2] Richard A. Harshman, et al. Indexing by Latent Semantic Analysis, 1990, J. Am. Soc. Inf. Sci.

[3] T. Landauer, et al. Indexing by Latent Semantic Analysis, 1990.

[4] George A. Miller, et al. WordNet: A Lexical Database for English, 1995, HLT.

[5] Charles T. Meadow, et al. Text information retrieval systems, 1992.

[6] Martha Palmer, et al. Verb Semantics and Lexical Selection, 1994, ACL.

[7] T. Landauer, et al. A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge, 1997.

[8] Curt Burgess, et al. Explorations in context space: Words, sentences, discourse, 1998.

[9] Dekang Lin, et al. Automatic Retrieval and Clustering of Similar Words, 1998, ACL.

[10] Mark Stevenson, et al. The Reuters Corpus Volume 1 - from Yesterday’s News to Tomorrow’s Language Resources, 2002, LREC.

[11] Michael Collins, et al. Head-Driven Statistical Models for Natural Language Parsing, 2003, CL.

[12] David McLean, et al. An Approach for Measuring Semantic Similarity between Words Using Multiple Information Sources, 2003, IEEE Trans. Knowl. Data Eng.

[13] Steven Bird, et al. NLTK: The Natural Language Toolkit, 2002, ACL.

[14] Berthier A. Ribeiro-Neto, et al. Image retrieval using multiple evidence ranking, 2004, IEEE Transactions on Knowledge and Data Engineering.

[15] Chris Quirk, et al. Unsupervised Construction of Large Paraphrase Corpora: Exploiting Massively Parallel News Sources, 2004, COLING.

[16] Mehran Sahami, et al. A web-based kernel function for measuring the similarity of short text snippets, 2006, WWW '06.

[17] Regina Barzilay, et al. Paraphrasing for Automatic Evaluation, 2006, NAACL.

[18] James R. Curran. Proceedings of the COLING/ACL on Interactive presentation sessions, 2006.

[19] Carlo Strapparava, et al. Corpus-based and Knowledge-based Measures of Text Semantic Similarity, 2006, AAAI.

[20] E. Loper, et al. NLTK: The Natural Language Toolkit, 2006, ACL 2006.

[21] Susan T. Dumais, et al. Similarity Measures for Short Segments of Text, 2007, ECIR.

[22] Graeme Hirst, et al. Computing Word-Pair Antonymy, 2008, EMNLP.

[23] Silvia Bernardini, et al. The WaCky wide web: a collection of very large linguistically processed web-crawled corpora, 2009, Lang. Resour. Evaluation.

[24] Björn-Olav Dozo, et al. Quantitative Analysis of Culture Using Millions of Digitized Books, 2010.

[25] Hakan Ferhatosmanoglu, et al. Short text classification in twitter to improve information filtering, 2010, SIGIR.

[26] Chih-Jen Lin, et al. LIBSVM: A library for support vector machines, 2011, TIST.

[27] Eneko Agirre, et al. SemEval-2012 Task 6: A Pilot on Semantic Textual Similarity, 2012, *SEMEVAL.

[28] Timothy W. Finin, et al. Schema-free structured querying of DBpedia data, 2012, CIKM.

[29] Jan Snajder, et al. TakeLab: Systems for Measuring Semantic Text Similarity, 2012, *SEMEVAL.

[30] Timothy W. Finin, et al. Improving Word Similarity by Augmenting PMI with Estimates of Word Polysemy, 2013, IEEE Transactions on Knowledge and Data Engineering.

[31] Chris Callison-Burch, et al. PPDB: The Paraphrase Database, 2013, NAACL.

[32] Eneko Agirre, et al. *SEM 2013 shared task: Semantic Textual Similarity, 2013, *SEMEVAL.

[33] Dekang Lin. Automatic Retrieval and Clustering of Similar Words, 2022, COLING.