MILC2: A Multi-Layer Multi-Instance Learning Approach to Video Concept Detection

Video is structured data carrying multi-layer (ML) information; for example, a shot comprises three layers: shot, keyframe, and region. Moreover, a multi-instance (MI) relation is embedded between consecutive layers. Both the ML structure and the MI relation are essential for video concept detection. Previous work [5] handled the ML structure and MI relation by constructing an MLMI kernel in which every layer is assumed to contribute equally. However, such equal weighting can neither model the MI relation well nor handle the ambiguity propagation problem, i.e., the propagation of sub-layer label uncertainty through multiple layers, since it has been proved that different layers contribute differently to the kernel. In this paper, we propose a novel algorithm named MILC2 (Multi-Layer Multi-Instance Learning with Inter-layer Consistency Constraint) to tackle the ambiguity propagation problem. It explicitly introduces an inter-layer consistency constraint that measures the disagreement between layers, so that the MI relation is better modeled. The learning task is formulated in a regularization framework with three components: hyper-bag prediction error, inter-layer inconsistency measure, and classifier complexity. We apply MILC2 to video concept detection on the TRECVID 2005 development corpus and report better performance than both standard Support Vector Machine based and MLMI kernel methods.
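
As a reading aid, the three-component regularization framework named in the abstract can be written schematically as below. This is only a generic sketch: the symbols f, B_i, y_i, V, C, \Omega, \lambda_1, and \lambda_2 are assumptions introduced here for illustration, and the abstract does not give the exact loss functions or weighting.

```latex
% Schematic only: all symbols are illustrative assumptions, not taken from the paper.
% f        : classifier to be learned
% B_i, y_i : i-th hyper-bag (e.g., a shot with its keyframes and regions) and its label
% V        : loss measuring the hyper-bag prediction error
% C        : measure of disagreement between predictions on consecutive layers
% \Omega   : classifier complexity term
% \lambda_1, \lambda_2 : assumed non-negative trade-off weights
\min_{f}\;
\underbrace{\sum_{i=1}^{N} V\bigl(y_i, f(B_i)\bigr)}_{\text{hyper-bag prediction error}}
\;+\; \lambda_1 \underbrace{\sum_{i=1}^{N} C\bigl(f; B_i\bigr)}_{\text{inter-layer inconsistency}}
\;+\; \lambda_2 \underbrace{\Omega(f)}_{\text{classifier complexity}}
```

Under this reading, setting \lambda_1 = 0 would drop the consistency term and leave an ordinary regularized bag-level classifier, which is one way to see what the inter-layer constraint adds.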

[1]  Rong Yan, et al. Semi-supervised cross feature learning for semantic concept detection in videos, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[2]  Tao Mei, et al. Multi-layer multi-instance kernel for video concept detection, 2007, ACM Multimedia.

[3]  Tao Mei, et al. Correlative multi-label video annotation, 2007, ACM Multimedia.

[4]  Thomas Hofmann, et al. Kernel Methods for Missing Variables, 2005, AISTATS.

[5]  Thomas Gärtner, et al. Multi-Instance Kernels, 2002, ICML.

[6]  James T. Kwok, et al. Marginalized Multi-Instance Kernels, 2007, IJCAI.

[7]  Oded Maron, et al. Multiple-Instance Learning for Natural Scene Classification, 1998, ICML.

[8]  B. S. Manjunath, et al. Unsupervised Segmentation of Color-Texture Regions in Images and Video, 2001, IEEE Trans. Pattern Anal. Mach. Intell.

[9]  John R. Smith, et al. Large-scale concept ontology for multimedia, 2006, IEEE MultiMedia.

[10]  Yixin Chen, et al. MILES: Multiple-Instance Learning via Embedded Instance Selection, 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Paul Over, et al. Evaluation campaigns and TRECVid, 2006, MIR '06.

[12]  James T. Kwok, et al. A regularization framework for multiple-instance learning, 2006, ICML.