We analyze the results of a semantic annotation task performed by novice taggers as part of the WordNet SemCor project (Landes et al., in press). Each polysemous content word in a text was matched to a sense from WordNet. Comparing the performance of the novice taggers with that of experienced lexicographers, we find that the degree of polysemy, the part of speech, and the position of the target word's sense within the WordNet entry all played a role in the taggers' choices. The taggers agreed with one another on a sense choice more often than they agreed with two lexicographers, suggesting an effect of experience on sense distinction. Evidence indicates that taggers selecting senses from a list ordered by frequency of occurrence, where salient, core senses are found at the beginning of the entry, use a different strategy than taggers working with a randomly ordered list of senses.

1 Introduction

Our present understanding of how the meanings of polysemous words are represented in speakers' minds and accessed during language use is poor. One model of the mental lexicon, implicit in much of computational linguistics, likens it to a dictionary, with a discrete entry for each word form and each sense of a polysemous word form. Language production and comprehension then would simply require "looking up" the appropriate entry and selecting the intended meaning.

If this model of the mental lexicon, with its discrete and non-overlapping sense representations, were correct, both the creation and the use of dictionaries would be straightforward. Lexicographers collect large numbers of occurrences of words from a corpus. Interpreting the different meanings of polysemous words from the corpus presents no difficulty, since lexicographers simply do what they do as competent speakers of the language. The step that is particular to lexicography is transforming the corpus occurrences of a given word form into a number of discrete senses in the format of dictionary entries.
Cross-dictionary comparisons show that carving up the different meanings of a polysemous word into discrete dictionary senses is difficult. The number of senses listed for a polysemous word often differs across dictionaries, reflecting "lumping" versus "splitting" strategies; some senses appear in one dictionary but are absent from another. Yet postulating different mental lexicons seems unwarranted, given our rapid and successful communication. Rather, the mapping process from occurrence to dictionary entry may give rise to difficulties and discrepancies across dictionaries because speakers' meaning representations may not resemble those of dictionaries, with their flat and discrete senses, thus making lexicography an artificial and therefore challenging task.

Semantic tagging is the inverse of lexicography, in that taggers identify and interpret dictionary entries with respect to words occurring in a text. Taggers, like lexicographers, first interpret the target word in the text, and then match the meaning they have identified for a given occurrence of a polysemous word with one of several dictionary senses.

Our goal was to examine the difficulties associated with semantic tagging. Because taggers face the same task as lexicographers (although the former select, rather than create, dictionary senses to match word occurrences in text), we expected to see discrepancies across taggers in the results of the semantic annotation task. Moreover, we expected that polysemous words that receive very different treatments across dictionaries would also be tagged differently by the annotators.
[1] James Pustejovsky et al. The Generative Lexicon. CL, 1995.
[2] George A. Miller et al. Using a Semantic Concordance for Sense Identification. HLT, 1994.
[3] George A. Miller et al. Introduction to WordNet: An On-line Lexical Database. 1990.
[4] Dedre Gentner et al. The Verb Mutability Effect: Studies of the Combinatorial Semantics of Nouns and Verbs. 1990.
[5] Christiane Fellbaum et al. Building Semantic Concordances. 1998.
[6] J. Katz. Semantic Theory and the Meaning of 'Good'. 1964.
[7] G. Miller et al. Semantic Networks of English. Cognition, 1991.
[8] David Yarowsky et al. Estimating Upper and Lower Bounds on the Performance of Word-Sense Disambiguation Programs. ACL, 1992.
[9] Adam Kilgarriff et al. Dictionary Word Sense Distinctions: An Enquiry into Their Nature. Comput. Humanit., 1992.