In a pilot application built on a web search engine, called Web-based Relation Completion (WebRC), we propose to join two columns of entities linked by a predefined relation by mining knowledge from the web through the search engine. To achieve this, we model a novel retrieval task, Relation Query Expansion (RelQE): given an entity (the query), the task is to retrieve documents containing entities that stand in the predefined relation to the given one. Solving this problem entails expanding the query before submitting it to the web search engine, so that the top K search results consist mostly of documents containing the linked entity. In this paper, we propose a novel Learning-based Relevance Feedback (LRF) approach to this retrieval task. Expansion terms are learned from training pairs of entities linked by the predefined relation and are then applied to new entity queries to find entities linked by the same relation. After describing the approach, we present experimental results on real-world web data collections, which show that LRF improves the precision of top-ranked search results in every case, reaching up to 8.6 times the baseline. With LRF, WebRC also performs well above the baseline.
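The idea of learning expansion terms from training pairs and reusing them on new entity queries can be illustrated with a minimal sketch. This is not the LRF method itself: the simple co-occurrence counting, the function names `learn_expansion_terms` and `expand_query`, the `snippets_for` callback, and the toy corpus are all assumptions made for illustration, with an in-memory dictionary standing in for a web search engine.

```python
from collections import Counter

def learn_expansion_terms(training_pairs, snippets_for, top_n=3, stopwords=frozenset()):
    """Count terms in snippets that mention both entities of a training pair.

    Hypothetical stand-in for a learned expansion model: a term scores one
    point for every snippet in which it co-occurs with a linked pair.
    """
    counts = Counter()
    for source, target in training_pairs:
        # Words of the entities themselves are never expansion terms.
        exclude = set(source.lower().split()) | set(target.lower().split())
        for snippet in snippets_for(source):
            text = snippet.lower()
            if target.lower() not in text:
                continue  # keep only snippets that contain the linked entity
            terms = {
                t for t in text.split()
                if t.isalpha() and t not in exclude and t not in stopwords
            }
            counts.update(terms)
    # Sort by descending count, then alphabetically, for a deterministic result.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]

def expand_query(entity, expansion_terms):
    """Build an expanded query string: the quoted entity plus the learned terms."""
    return " ".join([f'"{entity}"', *expansion_terms])

# Toy corpus (assumed data) standing in for search-engine snippets.
TOY_CORPUS = {
    "Paris": ["paris is the capital of france", "paris hotels and flights"],
    "Rome": ["rome is the capital of italy", "rome travel guide"],
}
TOY_PAIRS = [("Paris", "France"), ("Rome", "Italy")]
TOY_STOPWORDS = frozenset({"is", "the", "of"})

learned_terms = learn_expansion_terms(
    TOY_PAIRS, lambda e: TOY_CORPUS.get(e, []), top_n=1, stopwords=TOY_STOPWORDS
)
print(expand_query("Berlin", learned_terms))
```

In this toy setting the terms learned from the city and country pairs are attached to an unseen entity, mirroring how expansion terms learned for a relation are applied to new entity queries.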