Sequential Supervised Learning for Hypernym Discovery from Wikipedia

Hypernym discovery is an essential task for building and extending ontologies automatically. Compared with the Web as a whole, online encyclopedias are a far more structured and reliable source for information extraction. In this paper we propose a novel approach that combines syntactic and lexical-semantic information to identify hypernymic relationships. From first sentences of the German version of Wikipedia, we compiled training data, created both semi-automatically and manually, and a gold standard for evaluation. We trained a sequential supervised learner with a semantically enhanced tagset. The experiments showed that the cleanliness of the data matters far more than its quantity, and that bootstrapping is a viable way to improve the results. Our approach outperformed competitive lexico-syntactic patterns by 7%, reaching an F1-measure of over 0.91.
