Learning Named Entity Hyponyms for Question Answering

Lexical mismatch is a problem that confounds automatic question answering systems. While existing lexical ontologies such as WordNet have been successfully used to match verbal synonyms (e.g., beat and defeat) and common nouns (e.g., tennis is-a sport), their coverage of proper nouns is less extensive. Question answering depends substantially on processing named entities, so it would be of significant benefit if lexical ontologies could be enhanced with additional hypernymic (i.e., is-a) relations that include proper nouns, such as Edward Teach is-a pirate. We demonstrate how a recently developed statistical approach to mining such relations can be tailored to identify named entity hyponyms, and how, as a result, superior question answering performance can be obtained. We ranked candidate hyponyms for 75 categories of named entities and attained 53% mean average precision. On TREC QA data, our method produces a 9% improvement in performance.
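
For readers unfamiliar with the ranking metric, the following is a minimal sketch of how mean average precision could be computed over per-category rankings of candidate hyponyms. It is illustrative only: the function names, the toy categories, and the candidate lists are assumptions rather than data from this work, and it uses one common convention (normalizing by the number of relevant items per category), which may differ from the exact variant used in the evaluation.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked candidate list.

    Assumption: normalize by the total number of relevant items,
    so relevant items missing from the ranking lower the score.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            hits += 1
            # Precision at the rank where a correct hyponym appears.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]], gold: Dict[str, Set[str]]
) -> float:
    """Mean of the per-category average precision scores."""
    scores = [
        average_precision(ranked, gold.get(category, set()))
        for category, ranked in rankings.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two categories with ranked candidate hyponyms.
toy_rankings = {
    "pirate": ["Edward Teach", "Paris", "Anne Bonny"],
    "river": ["London", "Nile"],
}
toy_gold = {
    "pirate": {"Edward Teach", "Anne Bonny"},
    "river": {"Nile"},
}

toy_map = mean_average_precision(toy_rankings, toy_gold)
```

In this toy case the "pirate" ranking scores (1/1 + 2/3) / 2 and the "river" ranking scores 1/2, so the mean over the two categories is 2/3.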