Learning semantic distance from community-tagged media collection

This paper proposes a novel semantic-aware distance metric for images by mining multimedia data on the Internet, in particular, web images and their associated tags. As well known, a proper distance metric between images is a key ingredient in many realistic web image retrieval engines, as well many image understanding techniques. In this paper, we attempt to mine a novel distance metric from the web images by integrating their visual content as well as the associated user tags. Different from many existing distance metric learning algorithms which utilize the dissimilar or similar information between images pixels or features in signal level, the proposed scheme also takes the associated user-input tags into consideration. The visual content of images is also leveraged to respect an intuitive assumption that the visual similar images ought to have a smaller distance. A semi-definite programming is formulated to encode the above two aspects of criteria to learn the distance metric and we show such an optimization problem can be efficiently solved with a closed-form solution. We evaluate the proposed algorithm on two datasets. One is the benchmark Corel dataset and the other is a real-world dataset crawled from the image sharing website Flickr. By comparison with other existing distance learning algorithms, competitive results are obtained by the proposed algorithm in experiments.

[1]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[2]  Wei-Ying Ma,et al.  AnnoSearch: Image Auto-Annotation by Search , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[3]  Tao Mei,et al.  Correlative multi-label video annotation , 2007, ACM Multimedia.

[4]  Wei Liu,et al.  Semi-supervised distance metric learning for Collaborative Image Retrieval , 2008, CVPR.

[5]  Paul M. B. Vitányi,et al.  The Google Similarity Distance , 2004, IEEE Transactions on Knowledge and Data Engineering.

[6]  Inderjit S. Dhillon,et al.  Information-theoretic metric learning , 2006, ICML '07.

[7]  Qi Tian,et al.  Semantic Subspace Projection and Its Applications in Image Retrieval , 2008, IEEE Transactions on Circuits and Systems for Video Technology.

[8]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[9]  L. Bregman The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming , 1967 .

[10]  Tomer Hertz,et al.  Learning Distance Functions using Equivalence Relations , 2003, ICML.

[11]  Xian-Sheng Hua,et al.  Two-Dimensional Active Learning for image classification , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Geoffrey E. Hinton,et al.  Neighbourhood Components Analysis , 2004, NIPS.

[13]  I. Tsang,et al.  Kernel relevant component analysis for distance metric learning , 2005, Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005..

[14]  Jing Hua,et al.  Graph theoretical framework for simultaneously integrating visual and textual features for efficient web image clustering , 2008, WWW.

[15]  Kilian Q. Weinberger,et al.  Resolving tag ambiguity , 2008, ACM Multimedia.

[16]  A. Farag,et al.  Kernel methods for statistical learning in computer vision and pattern recognition applications , 2005 .

[17]  Changhu Wang,et al.  Learning to reduce the semantic gap in web image retrieval and annotation , 2008, SIGIR '08.

[18]  Jaana Kekäläinen,et al.  IR evaluation methods for retrieving highly relevant documents , 2000, SIGIR '00.

[19]  Luo Si,et al.  Collaborative image retrieval via regularized metric learning , 2006, Multimedia Systems.

[20]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[21]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[22]  Philip S. Yu,et al.  Unsupervised learning on k-partite graphs , 2006, KDD '06.

[23]  Xian-Sheng Hua,et al.  Two-Dimensional Multilabel Active Learning with an Efficient Online Adaptation Model for Image Classification , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Inderjit S. Dhillon,et al.  Co-clustering documents and words using bipartite spectral graph partitioning , 2001, KDD '01.

[25]  Thorsten Joachims,et al.  Learning a Distance Metric from Relative Comparisons , 2003, NIPS.