Machine tractable dictionaries: design and construction

The purpose of the research in this volume is to design a machine-tractable dictionary from the Longman Dictionary of Contemporary English (LDOCE). A machine-tractable dictionary is intended to be a basic facility for a whole spectrum of natural language processing tasks. The research adopts a compositional-reduction approach to obtain a set of empirically derived definitional primitives, called "seed senses", and uses them as the predicates in formalized sense entries written in a nested predicate form. Over 40 years of continuous effort at natural language processing have led the research community in this area to the realization that very large machine-tractable dictionaries are essential to success in any further computational attempts at natural language. The emergence of machine-readable data, such as dictionaries, encyclopedias, and documents of a general, unrestricted nature, as by-products of modern typesetting technology facilitates the derivation of very large lexicons and knowledge bases at low cost. An open research question in computational lexicography in particular, and in natural language processing in general, involves the machine tractability of these lexicons. A lexicon is machine tractable only when it assists computer understanding of natural language text as well as the acquisition of new lexical and world knowledge by the computer.
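
As a rough illustration of what a sense entry in nested predicate form might look like, the following Python sketch represents an entry as a tree whose predicate names are drawn from a small set of seed senses. Everything in it is an assumption made for illustration: the seed senses listed, the entry for "kill", and the names Pred, render, and non_seed_predicates are hypothetical and are not taken from LDOCE or from the research described above.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Hypothetical seed senses, invented for this sketch only.
SEED_SENSES = {"cause", "become", "not", "alive"}


@dataclass(frozen=True)
class Pred:
    """A predicate applied to arguments; arguments are variables or nested predicates."""
    name: str
    args: Tuple[Union["Pred", str], ...] = ()


def render(term):
    """Return the nested predicate as a string such as 'p(x, q(y))'."""
    if isinstance(term, str):
        return term
    if not term.args:
        return term.name
    return f"{term.name}({', '.join(render(a) for a in term.args)})"


def non_seed_predicates(term):
    """Collect predicate names in the entry that are not seed senses."""
    if isinstance(term, str):
        return set()
    found = set() if term.name in SEED_SENSES else {term.name}
    for arg in term.args:
        found |= non_seed_predicates(arg)
    return found


# Hypothetical formalized entry for one sense of "kill".
kill_entry = Pred(
    "cause",
    ("x", Pred("become", ("y", Pred("not", (Pred("alive", ("y",)),))))),
)

print(render(kill_entry))
print(non_seed_predicates(kill_entry))
```

In this sketch an entry counts as fully reduced when non_seed_predicates returns an empty set, which is one simple way to read the idea of reducing definitions to a fixed set of primitives.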