Special requirements for comparative evaluation of web search engines

Performance evaluation of classical information retrieval systems usually aims to assess how well these systems find documents considered relevant to a given search query according to specific evaluation criteria. This approach, however, is not adequate for evaluating some information retrieval applications, such as web search engines. The special characteristics of the web pose multiple challenges both for information retrieval tasks and for the evaluation of search engines. Web-specific, user-specific and language-specific requirements should therefore be considered when designing and performing evaluation tests on operational web search engines. This paper discusses the special requirements for a comprehensive comparative evaluation of different web search engines and highlights some language-specific considerations for evaluation in the Arabic language.
