Paper information - Covering the vocabulary of technical abstracts using standard and specialized dictionaries

Natural language applications such as information retrieval systems will increasingly rely on standard dictionaries (in machine-readable form) as a source of lexical information. It is therefore important to determine how well the dictionaries cover the text that such systems are likely to encounter. Data gathered from an examination of computer science and library and information science abstracts show that the technical terms from the domains are not well covered by standard dictionaries. Coverage improves with the use of specialized computer science and library and information science dictionaries, although there is great variation in their performance. However, these dictionaries were not designed for automatic text processing, and so this paper concludes with a discussion of the difficulties of incorporating such dictionaries into natural language processing systems.

Stephanie W. Haas
