An Interactive Approach for Filtering Out Junk Images From Keyword-Based Google Search Results

Keyword-based Google Images search has become very popular for online image search. Unfortunately, images are indexed only by the text terms that are explicitly or implicitly linked with them, and these terms may not correspond exactly to the underlying image semantics. As a result, keyword-based Google Images search may return large numbers of junk images that are irrelevant to the given query. Based on this observation, we have developed an interactive approach for filtering out junk images from keyword-based Google Images search results. Our approach consists of the following major components. a) A kernel-based image clustering technique partitions the returned images into multiple clusters and outliers. b) Hyperbolic visualization displays large numbers of returned images according to their nonlinear visual similarity contexts, so that users can interactively assess the relevance of the returned images to their real query intentions and select one or more images to express their query intentions and personal preferences precisely. c) An incremental kernel learning algorithm translates the users' query intentions and personal preferences into updates of the mixture-of-kernels, generating better hypotheses that yield more accurate clustering of the returned images and more effective filtering of the junk images. Experiments on diverse keyword-based queries from the Google Images search engine have obtained very positive results. Our junk image filtering system is released for public evaluation at: http://www.cs.uncc.edu/~jfan/google-demo/.
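The abstract does not specify the base kernels or the update rule, so the following is only a minimal sketch of the idea behind component c), not the authors' algorithm. It assumes one Gaussian (RBF) kernel per feature type, a convex combination of those kernels as the mixture, and a simple heuristic that shifts weight toward kernels under which the user-selected relevant images are similar to each other and dissimilar to the images marked as junk. All function names, the `rate` parameter, and the toy feature values are hypothetical.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances, clipped at zero against rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-gamma * d2)

def mixture_kernel(kernels, weights):
    """Convex combination of base kernel matrices (the mixture-of-kernels)."""
    return sum(w * K for w, K in zip(weights, kernels))

def update_weights(kernels, weights, relevant, junk, rate=0.5):
    """Shift weight toward kernels that separate relevant images from junk.

    Assumed heuristic: each kernel is scored by the mean similarity among
    the relevant images minus the mean similarity between relevant and
    junk images. Negative scores are clipped to zero.
    """
    scores = []
    for K in kernels:
        within = K[np.ix_(relevant, relevant)].mean()
        across = K[np.ix_(relevant, junk)].mean()
        scores.append(max(within - across, 0.0))
    scores = np.asarray(scores)
    if scores.sum() == 0.0:
        return np.asarray(weights, dtype=float)  # feedback is uninformative
    target = scores / scores.sum()
    new = (1.0 - rate) * np.asarray(weights, dtype=float) + rate * target
    return new / new.sum()

# Toy data: six images, two feature types (values are made up).
# The "color" feature separates images 0-2 from images 3-5;
# the "texture" feature is identical for all images, hence uninformative.
color = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
texture = np.ones((6, 1))

kernels = [rbf_kernel(color, gamma=1.0), rbf_kernel(texture, gamma=1.0)]
weights = np.array([0.5, 0.5])

# The user marks images 0 and 1 as relevant and images 3 and 4 as junk.
new_weights = update_weights(kernels, weights, relevant=[0, 1], junk=[3, 4])
K = mixture_kernel(kernels, new_weights)
print(new_weights)
```

In this toy setting the weight moves from the uninformative texture kernel toward the color kernel, and the updated matrix `K` could then be passed to whichever kernel-based clustering method is used to re-partition the returned images.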
