We describe a method for improving the precision of metasearch results by scoring the visual features of documents' surrogate representations. These surrogate scores are used during fusion in place of the original scores or ranks provided by the underlying search engines. Visual features are extracted from typical search-result surrogate information, such as title, snippet, URL, and rank. This approach deliberately avoids the engine-specific scores and collection statistics that most traditional fusion strategies require. This restriction reflects the use of metasearch in practice, where knowledge of the underlying search engines' strategies cannot be assumed. We evaluate our approach using a precision-oriented test collection of manually constructed binary relevance judgments for the top ten results from ten web search engines over 896 queries. We show that our visual fusion approach significantly outperforms the rCombMNZ fusion algorithm by 5.71%, with 99% confidence, and the best individual web search engine by 10.9%, with 99% confidence.
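For context, the rCombMNZ baseline mentioned above fuses rankings by summing rank-derived scores and weighting each document by the number of engines that returned it. The sketch below is illustrative only: the function name and the particular rank-to-score mapping (a linear decay over the retrieval depth) are assumptions, not the paper's exact formulation.

```python
def rcomb_mnz(ranked_lists, depth=10):
    """Rank-based CombMNZ sketch: sum rank-derived scores across engines,
    then multiply by how many engines returned each document."""
    scores = {}
    counts = {}
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):
            s = 1.0 - (rank - 1) / depth  # linear rank-to-score mapping (assumed)
            scores[doc] = scores.get(doc, 0.0) + s
            counts[doc] = counts.get(doc, 0) + 1
    fused = {doc: scores[doc] * counts[doc] for doc in scores}
    return sorted(fused, key=fused.get, reverse=True)

# Example: two engines with overlapping top-3 results.
# Documents returned by both engines (d1, d2) are boosted above
# documents returned by only one (d3, d4).
engine_a = ["d1", "d2", "d3"]
engine_b = ["d2", "d1", "d4"]
fused_order = rcomb_mnz([engine_a, engine_b], depth=3)
print(fused_order)
```

The multiplication by the occurrence count is what distinguishes CombMNZ from plain score summation (CombSUM): agreement across engines is rewarded, which is why documents found by multiple engines rise to the top.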