We present an algorithm for automatically disambiguating noun-noun compounds by deducing the correct semantic relation between their constituent words. This algorithm uses a corpus of 2,500 compounds annotated with WordNet senses and covering 139 different semantic relations (we make this corpus available online for researchers interested in the semantics of noun-noun compounds). The algorithm takes as input the WordNet senses for the nouns in a compound, finds all parent senses (hypernyms) of those senses, and searches the corpus for other compounds containing any pair of those senses. The relation with the highest proportional co-occurrence with any sense pair is returned as the correct relation for the compound. This algorithm was tested using a 'leave-one-out' procedure on the corpus of compounds. The algorithm identified the correct relations for compounds with high precision: in 92% of cases where a relation was found with a proportional co-occurrence of 1.0, it was the correct relation for the compound being disambiguated.
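The procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the hypernym map, the toy corpus, the relation labels, and the function names are all hypothetical stand-ins for WordNet and the annotated corpus. It also assumes one plausible reading of "proportional co-occurrence", namely the fraction of corpus compounds containing a given sense pair that carry a given relation. The exact definition used in the paper may differ.

```python
from collections import Counter
from itertools import product

# Hypothetical toy hypernym map (child sense -> parent sense), standing in for WordNet.
HYPERNYMS = {
    "kitchen.n.01": "room.n.01",
    "bedroom.n.01": "room.n.01",
    "room.n.01": "area.n.01",
    "knife.n.01": "tool.n.01",
    "tool.n.01": "artifact.n.01",
    "lamp.n.01": "artifact.n.01",
    "steel.n.01": "metal.n.01",
}

# Hypothetical annotated corpus: (modifier sense, head sense, relation).
CORPUS = [
    ("kitchen.n.01", "knife.n.01", "located_in"),
    ("bedroom.n.01", "lamp.n.01", "located_in"),
    ("kitchen.n.01", "lamp.n.01", "located_in"),
    ("steel.n.01", "knife.n.01", "made_of"),
]


def ancestors(sense):
    """Return the sense itself followed by all of its hypernyms."""
    chain = [sense]
    while sense in HYPERNYMS:
        sense = HYPERNYMS[sense]
        chain.append(sense)
    return chain


def disambiguate(modifier, head, corpus):
    """Return (relation, score) for the best-scoring sense pair, or (None, 0.0)."""
    query_pairs = set(product(ancestors(modifier), ancestors(head)))
    pair_counts = Counter()      # sense pair -> number of compounds containing it
    relation_counts = Counter()  # (sense pair, relation) -> number of compounds
    for mod, hd, relation in corpus:
        for pair in product(ancestors(mod), ancestors(hd)):
            if pair in query_pairs:
                pair_counts[pair] += 1
                relation_counts[pair, relation] += 1
    if not relation_counts:
        return None, 0.0
    # Assumed scoring: highest proportion first, ties broken by raw count.
    best = max(
        relation_counts,
        key=lambda k: (relation_counts[k] / pair_counts[k[0]], relation_counts[k]),
    )
    pair, relation = best
    return relation, relation_counts[best] / pair_counts[pair]


def leave_one_out_precision(corpus):
    """Precision over held-out compounds whose best relation scores exactly 1.0."""
    confident = correct = 0
    for i, (mod, hd, gold) in enumerate(corpus):
        rest = corpus[:i] + corpus[i + 1:]  # hold out the compound being tested
        relation, score = disambiguate(mod, hd, rest)
        if score == 1.0:
            confident += 1
            correct += relation == gold
    return correct / confident if confident else 0.0
```

In this sketch, a compound whose exact senses never appear in the corpus can still be resolved through shared hypernyms: `disambiguate("bedroom.n.01", "knife.n.01", CORPUS)` matches pairs such as `("room.n.01", "artifact.n.01")`. A compound with no matching sense pair yields `(None, 0.0)` and is excluded from the precision figure.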