Multi-Modality Transfer Based on Multi-Graph Optimization for Domain Adaptive Video Concept Annotation

Multi-modality, a unique and important property of video data, is typically ignored in existing video adaptation processes. To solve this problem, we propose a novel approach, named multi-modality transfer based on multi-graph optimization (MMT-MGO), which leverages multi-modality knowledge generalized by auxiliary classifiers in the source domain to assist multi-graph optimization (a graph-based semi-supervised learning method) in the target domain for video concept annotation. To the best of our knowledge, this is the first work to introduce multi-modality transfer into domain adaptive video concept detection and annotation. Moreover, we propose an efficient incremental extension scheme that sequentially estimates small batches of newly emerging data without modifying the structure of the multi-graph scheme. The proposed scheme achieves accuracy comparable to that of a full re-optimization, which combines the new data with the data corpus from the most recent round of optimization, while greatly reducing the estimation time. Extensive experiments on the TRECVID 2005 and 2007 data sets demonstrate the effectiveness of both the multi-modality transfer scheme and the incremental extension scheme.
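To make the idea of graph-based semi-supervised learning over several modalities concrete, the following is a minimal sketch and not the MMT-MGO algorithm itself. It assumes one Gaussian affinity graph per modality, fixed equal graph weights (the abstract does not say how the graphs are weighted), and the closed-form propagation of the local and global consistency method. The function names, the parameters `sigma` and `alpha`, and the toy "source classifier" scores used to initialize unlabeled samples are all hypothetical.

```python
import numpy as np


def normalized_affinity(X, sigma=1.0):
    """Gaussian affinity with zero diagonal, symmetrically normalized."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    return W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def multi_graph_propagate(modalities, y_init, weights=None, alpha=0.9):
    """Propagate initial scores over a weighted combination of graphs.

    modalities: list of (n, d_g) feature arrays, one per modality.
    y_init:     (n,) initial scores (labels, or prior scores if unlabeled).
    weights:    graph weights summing to 1; equal weights are an assumption.
    """
    n = y_init.shape[0]
    if weights is None:
        weights = np.full(len(modalities), 1.0 / len(modalities))
    S = sum(w * normalized_affinity(X) for w, X in zip(weights, modalities))
    # Closed form: F = (1 - alpha) * (I - alpha * S)^{-1} * Y
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * S, y_init)


def toy_example():
    """Two well-separated clusters observed in two toy modalities."""
    rng = np.random.default_rng(0)
    centers_a = np.repeat([[0.0, 0.0], [5.0, 5.0]], 5, axis=0)
    centers_b = np.repeat([[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]], 5, axis=0)
    mod_a = centers_a + 0.3 * rng.standard_normal(centers_a.shape)
    mod_b = centers_b + 0.3 * rng.standard_normal(centers_b.shape)

    # Hypothetical source-domain classifier scores; sample 3 is wrong on purpose.
    source_scores = np.array([0.5] * 5 + [-0.5] * 5)
    source_scores[3] = -0.5

    y_init = 0.2 * source_scores  # weak prior for unlabeled samples
    y_init[0], y_init[5] = 1.0, -1.0  # one labeled sample per cluster
    return multi_graph_propagate([mod_a, mod_b], y_init)


scores = toy_example()
print(np.round(scores, 3))
```

In this toy setup the propagated scores follow the cluster structure, so the deliberately wrong prior on sample 3 is overridden by its neighbors. A full re-optimization in this sketch would mean rebuilding `S` and re-solving the linear system whenever new samples arrive, which is the cost that an incremental scheme is meant to avoid.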
