Paper Information - A Feature Sequence Kernel for Video Concept Classification

A Feature Sequence Kernel for Video Concept Classification

Kernel methods such as Support Vector Machines are widely applied to classification problems, including concept detection in video. Nonetheless, issues such as modeling specific distance functions of feature descriptors or the temporal sequence of features in the kernel have received comparatively little attention in multimedia research. We review work on kernels for commonly used MPEG-7 visual features and propose a kernel for matching temporal sequences of these features. The sequence kernel is based on ideas from string matching, but it does not require discretization of the input feature vectors and handles partial matches and gaps. Evaluation on the TRECVID 2007 high-level feature extraction data set shows that the sequence kernel clearly outperforms the radial basis function (RBF) kernel and the MPEG-7 visual feature kernels using only single key frames.

Werner Bailer | W. Bailer

[1] B. S. Manjunath, et al. Color and texture descriptors, 2001, IEEE Trans. Circuits Syst. Video Technol.

[2] Wessel Kraaij, et al. TRECVID-2009 high-level feature task: Overview (slides), 2005.

[3] P. Beek, et al. Text of 15938-5 FCD Information Technology-Multimedia Content Description Interface-Part 5 Multimedia Description Schemes, 2001.

[4] Mei-Chen Yeh, et al. A string matching approach for visual retrieval and classification, 2008, MIR '08.

[5] Paul Over, et al. Evaluation campaigns and TRECVid, 2006, MIR '06.

[6] Trevor Darrell, et al. The Pyramid Match Kernel: Efficient Learning with Sets of Features, 2007, J. Mach. Learn. Res.

[7] Stéphane Ayache, et al. TRECVID 2007: Collaborative Annotation using Active Learning, 2007, TRECVID.

[8] Won Jong Jeon, et al. Spatio-temporal pyramid matching for sports videos, 2008, MIR '08.

[9] Yiannis Kompatsiaris, et al. K-Space at TRECvid 2006, 2006, TRECVID.

[10] Edward Y. Chang, et al. Multi-camera spatio-temporal fusion and biased sequence-data learning for security surveillance, 2003, MULTIMEDIA '03.

[11] Meng Wang, et al. Correlative multilabel video annotation with temporal kernels, 2008, TOMCCAP.