Paper information - Learning semantic embedding at a large scale

Learning semantic embedding at a large scale

A key problem in image annotation is to learn the underlying semantics. However, finding such semantic embeddings is a challenging task and often requires a large amount of tagging information. In this paper, we propose to utilize multi-modality cues by incorporating visual and textual information as embedded objects. The paper further presents a multi-task learning framework that simultaneously learns the approximation of two semantic embeddings with an efficient multi-stage convex relaxation technique. The experiments show that the proposed method achieves very promising performance in memory usage and training time on large-scale datasets, as well as in image classification accuracy.

Yihong Gong | Thomas S. Huang | Jinjun Wang | Tong Zhang | Min-Hsuan Tsai

[1] Rich Caruana, et al. Multitask Learning, 1998, Encyclopedia of Machine Learning and Data Mining.

[2] Daniel Gatica-Perez, et al. Tagging and retrieving images with co-occurrence models: from corel to flickr, 2009, LS-MMRM '09.

[3] Alin Achim, et al. 18th IEEE International Conference on Image Processing, ICIP 2011, Brussels, Belgium, September 11-14, 2011, 2011, ICIP.

[4] Y. Mori, et al. Image-to-word transformation based on dividing and vector quantizing images with words, 1999.

[5] Christopher K. I. Williams. On a Connection between Kernel PCA and Metric Multidimensional Scaling, 2004, Machine Learning.

[6] Jitendra Malik, et al. SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[7] Shuicheng Yan, et al. Graph embedding: a general framework for dimensionality reduction, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[8] David A. Forsyth, et al. Matching Words and Pictures, 2003, J. Mach. Learn. Res.

[9] Tong Zhang, et al. Analysis of Multi-stage Convex Relaxation for Sparse Regularization, 2010, J. Mach. Learn. Res.

[10] Eli Shechtman, et al. In defense of Nearest-Neighbor based image classification, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[11] Rong Jin, et al. Distance Metric Learning: A Comprehensive Survey, 2006.

[12] Cordelia Schmid, et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[13] Chong Wang, et al. Simultaneous image classification and annotation, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[14] Zhen Li, et al. Hierarchical Gaussianization for image classification, 2009, 2009 IEEE 12th International Conference on Computer Vision.