TLC is a supervised training (S) system that uses a Bayesian statistical model and features of a word's context to identify word sense. We describe the classifier's operation and how it can be configured to use only topical context cues, only local cues, or a combination of both. Our results on Senseval's final run are presented along with a comparison to the performance of the best S system and the average for S systems. We discuss ways to improve TLC by enriching its feature set and by substituting other decision procedures for the Bayesian model. Future development of supervised training classifiers will depend on the availability of tagged training data. TLC can assist in the hand-tagging effort by helping human taggers locate infrequent senses of polysemous words.
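To make the configuration options concrete, the following is a minimal sketch of a naive Bayes sense classifier that can use topical cues, local cues, or both. It is not TLC's actual implementation. The feature definitions (topical cues as unordered context words, local cues as words tagged with their offset from the target inside a small window), the add-one smoothing, and all names (`extract_features`, `NaiveBayesSenseClassifier`, `mode`, `window`) are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def extract_features(tokens, target_index, mode="both", window=2):
    """Return context features for the target word (hypothetical feature scheme)."""
    feats = []
    if mode in ("topical", "both"):
        # Topical cues: every other word in the context, position ignored.
        feats += [f"topic={w}" for i, w in enumerate(tokens) if i != target_index]
    if mode in ("local", "both"):
        # Local cues: words within +/- window, tagged with their offset.
        for offset in range(-window, window + 1):
            j = target_index + offset
            if offset != 0 and 0 <= j < len(tokens):
                feats.append(f"local[{offset:+d}]={tokens[j]}")
    return feats


class NaiveBayesSenseClassifier:
    """Naive Bayes over context features, with add-one smoothing (an assumption)."""

    def __init__(self, mode="both", window=2):
        self.mode = mode
        self.window = window
        self.sense_counts = Counter()                # number of examples per sense
        self.feature_counts = defaultdict(Counter)   # sense -> feature -> count
        self.vocabulary = set()                      # all features seen in training

    def train(self, examples):
        # examples: iterable of (tokens, target_index, sense) triples
        for tokens, target_index, sense in examples:
            self.sense_counts[sense] += 1
            for f in extract_features(tokens, target_index, self.mode, self.window):
                self.feature_counts[sense][f] += 1
                self.vocabulary.add(f)

    def log_scores(self, tokens, target_index):
        feats = extract_features(tokens, target_index, self.mode, self.window)
        total_examples = sum(self.sense_counts.values())
        vocab_size = len(self.vocabulary) + 1        # one extra slot for unseen features
        scores = {}
        for sense, n in self.sense_counts.items():
            counts = self.feature_counts[sense]
            denom = sum(counts.values()) + vocab_size
            score = math.log(n / total_examples)     # log prior of the sense
            for f in feats:
                score += math.log((counts[f] + 1) / denom)
            scores[sense] = score
        return scores

    def classify(self, tokens, target_index):
        scores = self.log_scores(tokens, target_index)
        return max(scores, key=scores.get)
```

Constructing the classifier with `mode="topical"`, `mode="local"`, or `mode="both"` selects which cues contribute to the score. A different smoothing method or decision procedure could replace the add-one estimate without changing the feature extraction.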