Coupled Dictionary Learning with Common Label Alignment for Cross-Modal Retrieval

Cross-modal retrieval has been an active research topic in recent years. However, most existing methods ignored discovering the common semantic relationship among different modalities so as to seriously reduce the retrieval accuracy. To cope with this problem, we propose a novel cross-modal retrieval method based on coupled dictionary learning with common label alignment. Concretely, our method first conducts coupled dictionary learning on the data from different modalities separately and then projects them into a common space, where the correlation between these modalities is encouraged by using common label alignment. Experimental results on two public datasets demonstrate that our method outperforms several state-of-the-art methods.

[1]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[2]  Rajat Raina,et al.  Efficient sparse coding algorithms , 2006, NIPS.

[3]  Yu-Chiang Frank Wang,et al.  Coupled Dictionary and Feature Space Learning with Applications to Cross-Domain Image Synthesis and Recognition , 2013, 2013 IEEE International Conference on Computer Vision.

[4]  John Shawe-Taylor,et al.  Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.

[5]  D. Jacobs,et al.  Bypassing synthesis: PLS for face recognition with pose, low-resolution and sketch , 2011, CVPR 2011.

[6]  Joshua B. Tenenbaum,et al.  Separating Style and Content with Bilinear Models , 2000, Neural Computation.

[7]  Wei Wang,et al.  Continuum regression for cross-modal multimedia retrieval , 2012, 2012 19th IEEE International Conference on Image Processing.

[8]  Roger Levy,et al.  A new approach to cross-modal multimedia retrieval , 2010, ACM Multimedia.

[9]  Yueting Zhuang,et al.  Supervised Coupled Dictionary Learning with Group Structures for Multi-modal Retrieval , 2013, AAAI.

[10]  Guillermo Sapiro,et al.  Online dictionary learning for sparse coding , 2009, ICML '09.

[11]  David W. Jacobs,et al.  Generalized Multiview Analysis: A discriminative latent space , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[13]  Quan Pan,et al.  Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.