Disambiguation: a Study in Weighted Preference*

The automatic construction of an IS_A taxonomy of noun senses from a machine-readable dictionary (MRD) has long been sought, but has been achieved with only limited success. The task requires solving two problems: 1) defining an algorithm that automatically identifies the genus, or hypernym, of a noun definition, and 2) defining an algorithm for lexical disambiguation of the genus term. In the last few years, effective methods for solving the first problem have been developed, but lexical disambiguation of the genus term has proven very difficult. At COLING 90 we described our initial work on the automatic creation of a taxonomy of noun senses from the Longman Dictionary of Contemporary English (LDOCE). The algorithm for lexical disambiguation of the genus term was accurate about 80% of the time and made use of three kinds of information in LDOCE: the semantic categories, the subject area markings, and the frequency of use information. In this paper we report a series of experiments that weight these three factors in various ways, and we describe improvements to the algorithm that raise its accuracy to about 90%.
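To make the idea of weighting the three factors concrete, the following is a minimal sketch of one possible weighted-preference scorer. It is an illustration under stated assumptions, not the algorithm reported in this paper: the `Sense` record, the `score` and `disambiguate` functions, the scoring formulas, the default weights, and the toy category and subject labels are all hypothetical, and the sense order in the dictionary entry is assumed to stand in for frequency of use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    """Hypothetical record for one dictionary sense (not LDOCE's real format)."""
    sense_id: str
    semantic_category: str      # toy label, not a real LDOCE code
    subject_codes: frozenset    # toy subject-area labels
    rank: int                   # 1 = listed first; assumed proxy for frequency


def score(candidate, headword, weights):
    """Weighted sum of three preference signals for one candidate genus sense."""
    w_sem, w_subj, w_freq = weights
    # 1.0 if the candidate shares the headword's semantic category, else 0.0.
    sem = 1.0 if candidate.semantic_category == headword.semantic_category else 0.0
    # 1.0 if the candidate shares at least one subject-area label, else 0.0.
    subj = 1.0 if candidate.subject_codes & headword.subject_codes else 0.0
    # Earlier-listed senses get a higher frequency score (assumed formula).
    freq = 1.0 / candidate.rank
    return w_sem * sem + w_subj * subj + w_freq * freq


def disambiguate(headword, candidates, weights=(1.0, 1.0, 1.0)):
    """Return the candidate genus sense with the highest weighted score."""
    return max(candidates, key=lambda c: score(c, headword, weights))


# Toy data: a noun sense whose definition has the genus term "vessel".
head = Sense("sloop_1", "movable-solid", frozenset({"nautical"}), 1)
candidates = [
    Sense("vessel_1", "solid", frozenset(), 1),                    # container
    Sense("vessel_2", "movable-solid", frozenset({"nautical"}), 2),  # ship
    Sense("vessel_3", "solid", frozenset({"anatomy"}), 3),         # tube, vein
]

best = disambiguate(head, candidates)
frequency_only = disambiguate(head, candidates, weights=(0.0, 0.0, 1.0))
```

With equal weights on this toy data, the category and subject matches outweigh the frequency preference and the "ship" sense is chosen; with frequency as the only factor, the first-listed "container" sense is chosen instead. Varying the weight tuple in this way is the kind of comparison a weighting experiment would run over a larger set of hand-checked definitions.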