Automatic Detection of Thesaurus Relations for Information Retrieval Applications

Is it possible to discover semantic term relations useful for thesauri without any semantic information? Yes, it is. A recent approach to automatic thesaurus construction combines two sources: explicit linguistic knowledge, in the form of a domain-independent parser without any semantic component, and implicit linguistic knowledge contained in large amounts of real-world text. Such texts implicitly contain the linguistic, and especially semantic, knowledge that their authors needed in order to formulate them. This article explains how that implicit semantic knowledge can be made explicit. Evaluations of the quality and performance of the approach are very encouraging.
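
To make the idea of turning implicit knowledge into explicit relations more concrete, the following is a minimal sketch, not the article's actual method. It assumes that a purely syntactic parser has already produced head-modifier pairs, and it treats two terms as related when they occur in similar syntactic contexts. The sample pairs, the function names, and the choice of Jaccard overlap as the similarity measure are all illustrative assumptions; the abstract does not specify them.

```python
from collections import defaultdict

# Hypothetical parser output: (head, modifier) pairs extracted from text.
# These toy pairs are invented for illustration only.
PAIRS = [
    ("container", "plastic"),
    ("container", "sealed"),
    ("container", "storage"),
    ("vessel", "plastic"),
    ("vessel", "sealed"),
    ("vessel", "pressure"),
    ("algorithm", "sorting"),
    ("algorithm", "fast"),
]


def context_sets(pairs):
    """Map each term to the set of syntactic contexts it occurs in."""
    contexts = defaultdict(set)
    for head, modifier in pairs:
        contexts[head].add(("mod", modifier))   # head seen with this modifier
        contexts[modifier].add(("head", head))  # modifier seen with this head
    return contexts


def jaccard(a, b):
    """Jaccard overlap of two sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(term, contexts, top_n=3):
    """Rank all other terms by context overlap with `term`."""
    target = contexts.get(term, set())
    scores = [
        (other, jaccard(target, ctx))
        for other, ctx in contexts.items()
        if other != term
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scores.sort(key=lambda item: (-item[1], item[0]))
    return scores[:top_n]


if __name__ == "__main__":
    for other, score in most_similar("container", context_sets(PAIRS)):
        print(f"{other}: {score:.2f}")
```

In this toy data, "container" and "vessel" share two of four distinct modifier contexts, so they come out as the closest pair even though no semantic information was supplied. A realistic system would need far larger corpora and most likely a weighted similarity measure.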
