Paper Information - Classifying Taxonomic Relations between Pairs of Wikipedia Articles


Natural language generation systems rely on taxonomic thesauri for tasks such as lexical choice and aggregation. WordNet is one such taxonomy, but it is limited in size. Motivated by the needs of a generation system in the scientific literature domain, we present a method for building a taxonomic thesaurus from Wikipedia articles, where each article represents a potential concept in the taxonomy. We propose framing the problem of creating a taxonomy as a classification task of the potential relations between individual Wikipedia article pairs, and show that a supervised algorithm can achieve high precision in this task with very little training data.

Kathleen McKeown | Or Biran
