Web image clustering by consistent utilization of visual features and surrounding texts

Image clustering, an important technology for image processing, has been actively researched for a long time. In recent years, with the explosive growth of the Web, it has become a critical technology for helping users digest the large amount of online visual information. However, as far as we know, many previous approaches to image clustering have used either low-level visual features or surrounding texts alone, and have rarely exploited both kinds of information within the same framework. To tackle this problem, we propose a novel method named consistent bipartite graph co-partitioning, which clusters Web images based on the consistent fusion of the information contained in both low-level features and surrounding texts. In particular, we formulate the task as a constrained multi-objective optimization problem, which can be solved efficiently by semi-definite programming (SDP). Experiments on a real-world Web image collection show that the proposed method outperforms methods based only on low-level features or only on surrounding texts.
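
To make the setting concrete, the following is a minimal sketch of how two bipartite graphs (visual features to images, and images to surrounding-text terms) might be fused to split a set of images into two groups. It is not the SDP formulation described above: it swaps in a plain spectral relaxation (the second eigenvector of a normalized Laplacian) over a weighted union of the two graphs, purely as an illustration. The function name `fused_spectral_bipartition`, the weight `beta`, and the toy matrices are all assumptions introduced here.

```python
import numpy as np

def fused_spectral_bipartition(A, B, beta=0.5):
    """Split images into two groups using both bipartite graphs.

    A: (n_features x n_images) nonnegative feature-image weights.
    B: (n_images x n_terms) nonnegative image-term weights.
    beta: hypothetical trade-off between the visual and textual graphs.
    Simplified spectral stand-in, NOT the SDP-based method of the abstract.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    m, n = A.shape
    n_b, t = B.shape
    if n != n_b:
        raise ValueError("A and B must share the image dimension")

    # Adjacency of the fused graph; node order is features, images, terms.
    size = m + n + t
    W = np.zeros((size, size))
    W[:m, m:m + n] = beta * A
    W[m:m + n, :m] = beta * A.T
    W[m:m + n, m + n:] = (1.0 - beta) * B
    W[m + n:, m:m + n] = (1.0 - beta) * B.T

    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L_sym = np.eye(size) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; column 1 is the second smallest.
    _, eigvecs = np.linalg.eigh(L_sym)
    embedding = d_inv_sqrt * eigvecs[:, 1]

    # Threshold at zero and keep only the entries that belong to image nodes.
    return (embedding[m:m + n] > 0).astype(int)

# Toy data: images 0-2 share features and terms, as do images 3-5.
eps = 0.01  # weak cross links keep the fused graph connected
A = np.array([[1, 1, 1, eps, eps, eps],
              [1, 1, 1, eps, eps, eps],
              [eps, eps, eps, 1, 1, 1],
              [eps, eps, eps, 1, 1, 1]])
B = np.array([[1, 1, eps, eps],
              [1, 1, eps, eps],
              [1, 1, eps, eps],
              [eps, eps, 1, 1],
              [eps, eps, 1, 1],
              [eps, eps, 1, 1]])
labels = fused_spectral_bipartition(A, B, beta=0.5)
print(labels)
```

Which group receives label 0 and which receives label 1 is arbitrary, since the sign of an eigenvector is not fixed. Setting `beta` to 1 or 0 ignores one of the two graphs, which roughly corresponds to the single-source baselines mentioned above.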
