WSD system based on specialized Hidden Markov Model (upv-shmm-eaw)

We present a supervised approach to Word Sense Disambiguation (WSD) based on Specialized Hidden Markov Models. As training data we used the SemCor corpus together with the test data set provided by the Senseval-2 competition, and as dictionary we used WordNet 1.6. We evaluated our system on the English all-words task of the Senseval-3 competition.

1 Description of the WSD System

We consider WSD to be a tagging problem (Molina et al., 2002a). The tagging process can be formulated as a maximization problem using the Hidden Markov Model (HMM) formalism. Let 𝒪 be the set of output tags considered, and ℐ the input vocabulary of the application. Given an input sentence I = i_1, ..., i_T, where i_j ∈ ℐ, the tagging process consists of finding the sequence of tags O = o_1, ..., o_T, where o_j ∈ 𝒪, of maximum probability under the model, that is:

Ô = arg max_O P(O|I) = arg max_O ( P(O) · P(I|O)