Sense clusters for Information Retrieval: Evidence from Semcor and the EuroWordNet InterLingual Index

We examine three different types of sense clustering criteria with an Information Retrieval application in mind: methods based on the wordnet structure (such as generalization, cousins, and sisters); co-occurrence of senses obtained from Semcor; and equivalent translations of senses in other languages via the EuroWordNet InterLingual Index (ILI). We conclude that a) different NLP applications demand not only different sense granularities but also different (possibly overlapping) sense clusterings; b) co-occurrence of senses in Semcor provides strong evidence for Information Retrieval clusters, unlike methods based on wordnet structure and systematic polysemy; and c) parallel polysemy in three or more languages via the ILI, besides providing sense clusters for MT and CLIR, is strongly correlated with co-occurring senses in Semcor, and thus can be useful for Information Retrieval as well.
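As an illustration of the third criterion only, the following minimal sketch groups two senses of a word when they share a translation in at least three languages. It is not the procedure used in this work: the function name `parallel_polysemy_clusters`, the input layout, the sense identifiers, and the placeholder words are all hypothetical, and real input would come from ILI records rather than the toy dictionary below.

```python
from itertools import combinations

def parallel_polysemy_clusters(translations, min_languages=3):
    """Return sense pairs that share a translation in enough languages.

    translations: dict mapping a sense id to a dict that maps a language
    code to the set of words translating that sense (hypothetical layout).
    The result maps (sense_1, sense_2) to the sorted list of languages in
    which both senses have at least one translation in common.
    """
    clusters = {}
    for s1, s2 in combinations(sorted(translations), 2):
        common_languages = translations[s1].keys() & translations[s2].keys()
        shared = sorted(
            lang
            for lang in common_languages
            if translations[s1][lang] & translations[s2][lang]
        )
        if len(shared) >= min_languages:
            clusters[(s1, s2)] = shared
    return clusters

# Toy data with placeholder words; NOT taken from EuroWordNet.
TOY_TRANSLATIONS = {
    "sense_a": {"es": {"es_w1"}, "it": {"it_w1"}, "nl": {"nl_w1"}, "de": {"de_w1"}},
    "sense_b": {"es": {"es_w1", "es_w2"}, "it": {"it_w1"}, "nl": {"nl_w1"}, "de": {"de_w2"}},
    "sense_c": {"es": {"es_w3"}, "it": {"it_w1"}, "nl": {"nl_w2"}, "de": {"de_w3"}},
}

clusters = parallel_polysemy_clusters(TOY_TRANSLATIONS)
print(clusters)
```

In this toy input, only `sense_a` and `sense_b` share a translation in three languages, so they form the single candidate cluster; `sense_c` overlaps with each of them in one language only and stays separate. The threshold of three languages follows the abstract, while the decision to compare senses pairwise is an assumption of the sketch.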