Paper information - Augmenting Domain-Specific Thesauri with Knowledge from Wikipedia

We propose a new method for extending a domain-specific thesaurus with valuable information from Wikipedia. The main obstacle is disambiguating thesaurus concepts to the correct Wikipedia articles. Given the concept name, we first identify candidate mappings by analyzing article titles, their redirects, and disambiguation pages. Then, for each candidate, we compute a link-based similarity score to all mappings of context terms related to this concept. The article with the highest score is then used to augment the thesaurus concept. It is the source for the extended gloss explaining the concept's meaning, for synonymous expressions that can be used as additional non-descriptors in the thesaurus, for translations of the concept into other languages, and for new domain-relevant concepts.
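The disambiguation steps described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy link data, the lookup tables `TITLES`, `REDIRECTS` and `DISAMBIGUATION`, and the function names are hypothetical, and the abstract does not give the similarity formula, so a plain Jaccard overlap of incoming links stands in for the link-based score.

```python
from typing import Dict, List, Optional, Set

# Toy data (hypothetical): article title -> set of articles that link to it.
INLINKS: Dict[str, Set[str]] = {
    "Jaguar": {"Cat", "Predator", "Amazon rainforest"},
    "Jaguar Cars": {"Automobile", "Coventry", "Ford"},
    "Predator": {"Cat", "Food chain", "Amazon rainforest"},
    "Rainforest": {"Amazon rainforest", "Food chain", "Cat"},
}

# Hypothetical lookup tables standing in for Wikipedia article titles,
# redirects, and disambiguation pages (keys are lowercased names).
TITLES: Dict[str, List[str]] = {
    "jaguar": ["Jaguar"],
    "predator": ["Predator"],
    "rainforest": ["Rainforest"],
}
REDIRECTS: Dict[str, List[str]] = {"panthera onca": ["Jaguar"]}
DISAMBIGUATION: Dict[str, List[str]] = {"jaguar": ["Jaguar", "Jaguar Cars"]}


def candidates(name: str) -> List[str]:
    """Collect candidate articles from titles, redirects and disambiguation pages."""
    key = name.lower()
    found: List[str] = []
    for table in (TITLES, REDIRECTS, DISAMBIGUATION):
        for article in table.get(key, []):
            if article not in found:
                found.append(article)
    return found


def link_similarity(a: str, b: str) -> float:
    """Jaccard overlap of incoming links (a stand-in, not the paper's measure)."""
    links_a = INLINKS.get(a, set())
    links_b = INLINKS.get(b, set())
    union = links_a | links_b
    return len(links_a & links_b) / len(union) if union else 0.0


def disambiguate(name: str, context_terms: List[str]) -> Optional[str]:
    """Return the candidate with the highest average similarity to the context."""
    context_articles = [art for term in context_terms for art in candidates(term)]
    best: Optional[str] = None
    best_score = -1.0
    for cand in candidates(name):
        others = [c for c in context_articles if c != cand]
        score = (
            sum(link_similarity(cand, c) for c in others) / len(others)
            if others
            else 0.0
        )
        if score > best_score:
            best, best_score = cand, score
    return best


if __name__ == "__main__":
    print(disambiguate("jaguar", ["predator", "rainforest"]))
```

In this toy setup the animal article shares incoming links with the context articles while the car maker does not, so the former is selected. A real system would replace the hard-coded tables with data extracted from a Wikipedia dump and would likely use a more robust relatedness measure.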

David N. Milne | Olena Medelyan
