Using Wikipedia and supersense tagging for semi-automatic complex taxonomy construction

In this paper we propose an unsupervised approach for acquiring domain-related conceptual hierarchies from open-domain text. Supersense tagging (SST) is used to extract the upper-level terms, and Wikipedia categories and WordNet are employed to construct the rest of the taxonomic hierarchy. The result is a complete top-to-bottom taxonomy for every formal context. We describe both the method we implemented and some encouraging initial experimental results.
