Learning Taxonomic Relations from Heterogeneous Evidence

We present a novel approach to the automatic acquisition of taxonomic relations. The main difference from earlier approaches is that we do not rely on a single source of evidence, i.e. one specific algorithm or approach, but examine the possibility of learning taxonomic relations from various heterogeneous forms of evidence. In particular, we derive these forms of evidence using well-known NLP techniques and resources and combine them via two simple strategies. Our approach yields very promising results compared with results reported in the literature. The main aims of the work presented in this paper are (i) to gain insight into the behaviour of different approaches to learning taxonomic relations, (ii) to provide a first step towards combining these approaches, and (iii) to establish a baseline for further research.
