Cross-Domain Concept Detection with Dictionary Coherence by Leveraging Web Images

We propose a novel scheme for video concept learning that leverages social media, combining the selection of web training data and transfer subspace learning within a unified framework. Because of the cross-domain incoherence caused by mismatched data distributions, selecting sufficient positive training samples from scattered and diffuse social media resources is a challenging problem when training effective concept detectors. In this paper, given a concept, coherent positive samples are selected from web images for subsequent concept learning according to their degree of image coherence. Then, by exploiting both the selected samples and video keyframes, we train a robust concept classifier by means of a transfer subspace learning method. Experimental results demonstrate that the proposed approach achieves consistent overall improvement despite cross-domain incoherence.
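As a rough illustration of the selection step, the sketch below scores each candidate web image by how well its feature vector is reconstructed by a sparse dictionary learned from a small set of trusted seed examples of the concept, and keeps the most coherent candidates. This is a minimal sketch under stated assumptions: the sparse-coding formulation, the feature dimensionality, the dictionary size, and the keep ratio are illustrative choices, not the paper's exact coherence measure or parameters.

```python
# Hypothetical sketch: coherence-based selection of web training images.
# Candidates whose features are well reconstructed by a dictionary learned
# from seed concept examples are treated as "coherent" positives.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

def select_coherent_samples(seed_feats, web_feats, n_atoms=64, keep_ratio=0.3):
    """Return indices of the web images most coherent with the seed set."""
    # Learn a sparse dictionary from trusted seed examples of the concept.
    dico = MiniBatchDictionaryLearning(
        n_components=n_atoms,
        transform_algorithm="lasso_lars",
        transform_alpha=0.1,
        random_state=0,
    )
    dico.fit(seed_feats)

    # Sparse-code the candidates and measure reconstruction error;
    # a small residual serves here as a proxy for coherence with the concept.
    codes = dico.transform(web_feats)
    recon = codes @ dico.components_
    residual = np.linalg.norm(web_feats - recon, axis=1)

    # Keep the fraction of candidates with the smallest residual.
    n_keep = max(1, int(keep_ratio * len(web_feats)))
    return np.argsort(residual)[:n_keep]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seed_feats = rng.normal(size=(200, 128))   # e.g., bag-of-words feature vectors
    web_feats = rng.normal(size=(1000, 128))   # noisy candidates crawled from the web
    keep_idx = select_coherent_samples(seed_feats, web_feats)
    print("selected", len(keep_idx), "coherent web images")
```

The selected web images would then be combined with labeled video keyframes to learn the cross-domain classifier; that second stage is not shown here.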
