Investigating the precision of Web image search engines for popular and less popular entities

Image search is the second most frequently used search service on the Web, yet very few studies have investigated any aspect of it. In this study, we investigate the precision of the Web image search engines of Google and Bing for popular and less popular entities using text-based queries. We also investigate four additional aspects of Web image search engines that have not been studied before. We used 60 queries in total, drawn from three domains and covering both popular and less popular categories, and examined the relevancy of the top 100 images for each query. Our results indicate that image search is a solved problem for popular entities: the engines deliver 97% precision on average. For less popular entities, however, precision is much lower: over the top 100 results, average precision is 48% for Google and 33% for Bing. The most serious problem appears to be the worst cases, in which precision can fall below 10%. The results show that significant improvement is needed to better identify relevant images for less popular entities. One of the main issues is the association problem: when a Web page contains the query words and multiple images, both Google and Bing have difficulty determining which images are relevant.
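The precision figures above can be illustrated with a minimal sketch of precision over the top k results, assuming each returned image has been given a binary relevant/not-relevant judgment. The function names, the example queries, and the judgment lists are hypothetical and are not taken from the study; the choice to divide by the number of results actually returned when fewer than k are available is also an assumption.

```python
from statistics import mean


def precision_at_k(judgments, k=100):
    """Fraction of the top-k results judged relevant.

    `judgments` is a ranked list of booleans (True = relevant).
    Assumption: if fewer than k results exist, divide by the number returned.
    """
    top = judgments[:k]
    if not top:
        raise ValueError("no judged results")
    return sum(1 for relevant in top if relevant) / len(top)


def summarize(per_query, k=100):
    """Average and worst-case precision over a set of queries."""
    scores = {query: precision_at_k(j, k) for query, j in per_query.items()}
    worst_query = min(scores, key=scores.get)
    return {
        "average": mean(scores.values()),
        "worst": scores[worst_query],
        "worst_query": worst_query,
    }


if __name__ == "__main__":
    # Hypothetical judgments for two made-up queries (not data from the study).
    example = {
        "query A": [True] * 90 + [False] * 10,
        "query B": [True] * 5 + [False] * 95,
    }
    print(summarize(example))
```

Reporting the worst case alongside the average matters here because an average can hide individual queries for which almost none of the returned images are relevant.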
