Learning to Augment a Machine-Readable Dictionary

Dictionaries will always be incomplete; sometimes a word will acquire a new sense in a technical field, and new words are being added to the language all the time. This paper will discuss our comparisons between a machine-readable dictionary and various information retrieval test collections. We will first report on the number of words found in the dictionary, and how much improvement is gained by going to a larger dictionary. We will then discuss experiments concerned with augmenting the dictionary with information acquired from the corpus, and by exploiting redundancy within the dictionary itself.