Query Difficulty Prediction for Web Image Search

Image search plays an important role in daily life. Given a query, an image search engine retrieves images related to it. However, queries differ in search difficulty: for some, the search engine returns very good results, while for others the results are very unsatisfactory. It is therefore desirable to identify these “difficult” queries so that they can be handled properly. Query difficulty prediction (QDP) attempts to predict the quality of the search results for a query over a given collection. QDP has been investigated for many years in text document retrieval, and its importance is recognized in the information retrieval (IR) community. However, little effort has been devoted to QDP for image search. Compared with QDP in document retrieval, QDP in image search is more challenging because of the noise in textual features and the well-known semantic gap of visual features. This paper investigates the QDP problem in Web image search. A novel method is proposed to automatically predict the quality of image search results for an arbitrary query. The model is built on a set of features designed by exploring the visual characteristics of the images in the search results. Experiments on two real image search datasets demonstrate the effectiveness of the proposed method. Two applications, optimal image search engine selection and search results merging, are presented to show the promising applicability of QDP.
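To make the engine-selection application concrete, the following is a minimal sketch of how a predicted quality score could be used to choose among search engines for one query. It is not the method proposed in the paper: the function names (`select_engine`, `toy_coherence`), the feature vectors, and the scoring rule (mean pairwise cosine similarity of the returned images' visual features) are all assumptions introduced here for illustration, standing in for whatever trained QDP model is actually available.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_coherence(features: List[Vector]) -> float:
    """Hypothetical stand-in for a QDP model: mean pairwise cosine
    similarity among the visual features of the returned images."""
    pairs = [(a, b) for i, a in enumerate(features) for b in features[i + 1:]]
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


def select_engine(
    results_by_engine: Dict[str, List[Vector]],
    predict_quality: Callable[[List[Vector]], float],
) -> Tuple[str, Dict[str, float]]:
    """Score each engine's result list for one query and return the
    engine with the highest predicted quality, plus all scores."""
    scores = {
        engine: predict_quality(features)
        for engine, features in results_by_engine.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Made-up 2-D "visual features" for the top results of two engines.
results = {
    "engine_a": [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]],
    "engine_b": [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
}
best_engine, predicted = select_engine(results, toy_coherence)
```

In this toy setup the engine whose results are more visually consistent receives the higher score and is selected. A real predictor would replace `toy_coherence` while `select_engine` stays unchanged, since it depends only on a function that maps a result list to a score.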
