"More like these": growing entity classes from seeds

We present a corpus-based approach to the class expansion task. Given a set of seed entities, we use co-occurrence statistics drawn from a text collection to define a membership function, which ranks candidate entities for inclusion in the set. We describe an evaluation framework that uses data from Wikipedia. The performance of our class expansion method improves as the size of the text collection increases.
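
The ranking idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual membership function, so the scoring below (mean Jaccard overlap between a candidate and each seed) is an assumption chosen for illustration. The names `cooccurrence_counts`, `membership`, and `expand`, and the notion of a "context" as a list of entities seen together, are hypothetical and not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(contexts):
    """Count, per entity and per unordered pair, the contexts they appear in."""
    pair_counts = Counter()
    entity_counts = Counter()
    for context in contexts:
        entities = set(context)  # ignore repeats within one context
        entity_counts.update(entities)
        for a, b in combinations(sorted(entities), 2):
            pair_counts[(a, b)] += 1
    return pair_counts, entity_counts


def membership(candidate, seeds, pair_counts, entity_counts):
    """Assumed score: mean Jaccard overlap of the candidate with each seed."""
    if not seeds:
        raise ValueError("at least one seed entity is required")
    total = 0.0
    for seed in seeds:
        both = pair_counts[tuple(sorted((candidate, seed)))]
        either = entity_counts[candidate] + entity_counts[seed] - both
        total += both / either if either else 0.0
    return total / len(seeds)


def expand(seeds, contexts, top_k=5):
    """Rank non-seed entities by membership score, highest first."""
    pair_counts, entity_counts = cooccurrence_counts(contexts)
    candidates = set(entity_counts) - set(seeds)
    ranked = sorted(
        candidates,
        # Ties are broken alphabetically so the output is deterministic.
        key=lambda c: (-membership(c, seeds, pair_counts, entity_counts), c),
    )
    return ranked[:top_k]


# Toy data: each inner list stands for entities observed together in the corpus.
contexts = [
    ["Paris", "London", "Berlin"],
    ["Paris", "Berlin"],
    ["London", "Madrid"],
    ["apple", "banana"],
    ["Berlin", "Madrid", "Paris"],
]
top = expand(["Paris", "London"], contexts, top_k=2)
```

On this toy data, entities that often share contexts with the seeds rank above unrelated ones. A larger collection would supply more contexts per pair, which is one plausible reading of why performance improves with corpus size.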

Valentin Jijkoun | Maarten de Rijke | Eugénio C. Oliveira | Luís Sarmento