Purdue at TREC 2010 Entity Track: A Probabilistic Framework for Matching Types Between Candidate and Target Entities

This paper gives an overview of our work for the TREC 2010 Entity track. The goal of the track is to study entity-related search on Web data, a problem that has not been sufficiently addressed in prior research. For both the Related Entity Finding (REF) task and the Entity List Completion (ELC) task, we propose a unified probabilistic framework that incorporates matching between target entity types and candidate entity types. The framework is motivated by the observation that type information much more specific than the given target type can be inferred from the query narrative, and that these fine-grained types help narrow down the candidate entities. Specific probabilistic models for each task can be derived from this general framework. For the REF task, apart from the type matching component, we generally follow our previous work for the TREC 2009 Entity track. For the ELC task, we apply the same framework, and the resulting model combines structured document retrieval with type matching.
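
To make the idea of type matching concrete, the following is a minimal illustrative sketch and not the model derived in the paper. It assumes that a candidate's score is its retrieval relevance multiplied by a type matching term, where that term sums, over fine-grained types, the product of an assumed P(type | query narrative) and an assumed P(type | candidate). The function names, the factorization, and all numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def type_match_score(target_types: Dict[str, float],
                     candidate_types: Dict[str, float]) -> float:
    """Sum over fine-grained types t of P(t | query) * P(t | candidate).

    Both arguments are assumed to be probability distributions over type labels.
    """
    return sum(p * candidate_types.get(t, 0.0) for t, p in target_types.items())


def rank_candidates(relevance: Dict[str, float],
                    target_types: Dict[str, float],
                    candidate_types: Dict[str, Dict[str, float]]
                    ) -> List[Tuple[str, float]]:
    """Rank candidates by relevance * type match (an assumed factorization)."""
    scores = {
        entity: rel * type_match_score(target_types, candidate_types.get(entity, {}))
        for entity, rel in relevance.items()
    }
    # Highest combined score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical example: the query narrative suggests the target is an airline,
# which is more specific than a coarse given type such as "organization".
target = {"airline": 0.8, "organization": 0.2}
relevance = {"Lufthansa": 0.5, "Boeing": 0.6}
candidates = {
    "Lufthansa": {"airline": 0.9, "organization": 0.1},
    "Boeing": {"manufacturer": 0.7, "organization": 0.3},
}
ranking = rank_candidates(relevance, target, candidates)
```

In this toy example the candidate with the higher retrieval relevance is ranked lower because its types match the inferred fine-grained target type poorly, which is the narrowing effect described above.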