Senseval: The CL Research Experience

The CL Research Senseval system was the highest performing system among the ``All-words'' systems, with an overall fine-grained score of 61.6 percent for precision and 60.5 percent for recall on 98 percent of the 8,448 texts on the revised submission (up by almost 6 and 9 percent from the first). The results were achieved with an almost complete reliance on syntactic behavior, using (1) a robust and fast ATN-style parser producing parse trees with annotations on nodes, (2) DIMAP dictionary creation and maintenance software (after conversion of the Hector dictionary files) to hold dictionary entries, and (3) a strategy for analyzing the parse trees in concert with the dictionary data. Further considerable improvements are possible in the parser, exploitation of the Hector data (and representation of dictionary entries), and the analysis strategy, still with syntactic and collocational data. The Senseval data (the dictionary entries and the corpora) provide an excellent testbed for understanding the sources of failures and for evaluating changes in the CL Research system.