Comparing rankings of search results on the Web

The Web has become an information source for professional data gathering. Because of the vast amount of information on almost every topic, one cannot systematically go over the whole set of results and must therefore rely on the ordering of the results by the search engine. It is well known that search engines on the Web have low overlap in terms of coverage. In this study we measure how similar the rankings of search engines are on the overlapping results. We compare rankings of results for identical queries retrieved from several search engines. The method is based only on the set of URLs that appear in the answer sets of the engines being compared. To compare the rankings of two search engines, the Spearman correlation coefficient is computed; when more than two sets are compared, Kendall's W is used. Both are well-known measures, and the statistical significance of the results can be computed. The methods are demonstrated on a set of 15 queries that were submitted to four large Web search engines. The findings indicate that the large public search engines on the Web employ considerably different ranking algorithms.
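
The two measures named above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each engine's answer set is an ordered list of unique URLs, that only URLs returned by every engine in the comparison are kept, and that the kept URLs are re-ranked 1..n by their relative order in each list (so there are no ties). The function names `rank_within_overlap`, `spearman_rho`, and `kendall_w` are hypothetical.

```python
def rank_within_overlap(result_lists):
    """Keep only URLs present in every list and re-rank them 1..n per list.

    Assumption: each list holds unique URLs, best result first.
    """
    common = set(result_lists[0]).intersection(*result_lists[1:])
    reranked = []
    for urls in result_lists:
        kept = [u for u in urls if u in common]
        reranked.append({u: pos for pos, u in enumerate(kept, start=1)})
    return sorted(common), reranked


def spearman_rho(list_a, list_b):
    """Spearman rank correlation on the overlapping URLs of two lists."""
    urls, (ranks_a, ranks_b) = rank_within_overlap([list_a, list_b])
    n = len(urls)
    if n < 2:
        raise ValueError("need at least two overlapping URLs")
    # Sum of squared rank differences; valid because there are no ties.
    d2 = sum((ranks_a[u] - ranks_b[u]) ** 2 for u in urls)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def kendall_w(result_lists):
    """Kendall's coefficient of concordance W for two or more lists."""
    urls, ranks = rank_within_overlap(result_lists)
    m, n = len(result_lists), len(urls)
    if n < 2:
        raise ValueError("need at least two overlapping URLs")
    # Total rank each URL receives across all engines.
    totals = [sum(r[u] for r in ranks) for u in urls]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Made-up answer sets for illustration only.
engine_a = ["a", "b", "c", "d", "x"]
engine_b = ["b", "a", "y", "c", "d"]
print(spearman_rho(engine_a, engine_b))
print(kendall_w([engine_a, engine_b]))
```

In this sketch a value near 1 means the engines order their shared URLs almost identically, while a Spearman value near -1 (or a W near 0) means the orderings disagree. Significance testing is left out; it would require the usual tables or approximations for each statistic.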
