Paper information - Ensemble multi-instance multi-label learning approach for video annotation task

Automatic video annotation is an important component of video indexing, browsing, and retrieval. Traditional approaches represent a video clip with a single flat feature vector; however, video data usually has natural structure. Moreover, a video clip is generally relevant to multiple concepts. The video annotation task is therefore inherently a Multi-Instance Multi-Label (MIML) learning problem. In this paper, we propose the En-MIMLSVM approach for the video annotation task. It addresses the class imbalance and long training time problems that arise in most video annotation tasks. In addition, a temporally consistent weighted multi-instance kernel is developed to take into account both the temporal consistency in video data and the significance of instances at different levels of the pyramid representation. En-MIMLSVM is evaluated on the TRECVID 2005 data set, and the results show that it outperforms several state-of-the-art methods.
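
To make the idea of a weighted multi-instance kernel more concrete, the following is a minimal sketch and not the kernel defined in the paper. It assumes a weighted variant of the standard set kernel: each video clip is a bag of instance feature vectors, each instance carries a weight (for example, one weight per pyramid level), and the bag-level kernel is the weighted sum of Gaussian kernels over all instance pairs, normalized so that a bag compared with itself scores 1. The function names, the `gamma` parameter, and the example weights are all hypothetical, and the temporal consistency term mentioned in the abstract is omitted.

```python
import math


def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two instance feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def weighted_set_kernel(bag_x, weights_x, bag_y, weights_y, gamma=0.5):
    """Weighted sum of instance-level kernels over all instance pairs.

    Assumption: weights are positive and reflect how significant an
    instance is (e.g. by its pyramid level). This is an illustrative
    choice, not the paper's definition.
    """
    total = 0.0
    for x, wx in zip(bag_x, weights_x):
        for y, wy in zip(bag_y, weights_y):
            total += wx * wy * rbf(x, y, gamma)
    return total


def normalized_kernel(bag_x, weights_x, bag_y, weights_y, gamma=0.5):
    """Normalize so that a bag compared with itself gives 1.0."""
    kxy = weighted_set_kernel(bag_x, weights_x, bag_y, weights_y, gamma)
    kxx = weighted_set_kernel(bag_x, weights_x, bag_x, weights_x, gamma)
    kyy = weighted_set_kernel(bag_y, weights_y, bag_y, weights_y, gamma)
    return kxy / math.sqrt(kxx * kyy)


# Two toy bags: the first instance stands for a coarse pyramid level
# (weight 1.0), the second for a finer level (weight 0.5).
clip_a = [[0.0, 1.0], [1.0, 0.0]]
clip_b = [[0.5, 0.5], [2.0, 2.0]]
level_weights = [1.0, 0.5]

similarity = normalized_kernel(clip_a, level_weights, clip_b, level_weights)
print(similarity)
```

A kernel matrix built this way over all pairs of clips could then be passed to any SVM implementation that accepts precomputed kernels, with one binary classifier trained per concept label.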
