An LLE based Heterogeneous Metric Learning for Cross-media Retrieval

As unstructured heterogeneous multimedia data such as text and images become more widely used on the web, cross-media retrieval has become an increasingly important task. A key problem in cross-media retrieval is how to compute distances or similarities between different types of media data. In this paper, we propose a novel heterogeneous metric learning method for computing distances between images and texts. We extend Locally Linear Embedding (LLE) to handle heterogeneous data, so that the embedding both preserves homogeneous local information and captures heterogeneous constraints. To handle the out-of-sample problem, we learn two mapping functions from the embedding and use them to transform heterogeneous data into a homogeneous space, where retrieval is performed. Experimental results on two real-world datasets show the effectiveness of our approach.
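The out-of-sample step can be illustrated with a minimal sketch. It assumes the two mapping functions are linear and fit by ridge regression, which the abstract does not specify. The shared embedding `Y` here is a random stand-in, not the output of the heterogeneous LLE, and the names `fit_linear_map` and `retrieve` are hypothetical.

```python
import numpy as np

def fit_linear_map(X, Y, reg=1e-3):
    """Ridge regression: find P minimizing ||X P - Y||^2 + reg * ||P||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ Y)

def retrieve(query, P_query, gallery, P_gallery):
    """Rank gallery items by Euclidean distance to the query in the shared space."""
    q = query @ P_query          # project the query into the shared space
    G = gallery @ P_gallery      # project every gallery item
    dists = np.linalg.norm(G - q, axis=1)
    return np.argsort(dists)     # indices of gallery items, nearest first

# Toy paired data (assumption: synthetic, for illustration only).
rng = np.random.default_rng(0)
n, d_img, d_txt, k = 50, 8, 6, 3
Y = rng.normal(size=(n, k))                  # stand-in for a learned shared embedding
X_img = Y @ rng.normal(size=(k, d_img))      # synthetic image features
X_txt = Y @ rng.normal(size=(k, d_txt))      # synthetic text features

# One map per modality, both targeting the same embedding.
P_img = fit_linear_map(X_img, Y)
P_txt = fit_linear_map(X_txt, Y)

# Image-to-text retrieval: query with image 0, rank all texts.
ranking = retrieve(X_img[0], P_img, X_txt, P_txt)
```

Because both maps target the same embedding, an image query and a text gallery can be compared with an ordinary Euclidean distance once projected. In this toy setup the text paired with the query image should rank first.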
