Classifier Fusion for SVM-Based Multimedia Semantic Indexing

Concept indexing in multimedia libraries is very useful for users who search and browse, but it is also a challenging research problem. Combining several modalities, features, or concepts is one of the key issues in bridging the gap between signal and semantics. In this paper, we present three fusion schemes inspired by the classical early and late fusion schemes. First, we present a kernel-based fusion scheme that takes advantage of the kernel basis of classifiers such as SVMs. Second, we integrate a new normalization process into the early fusion scheme. Third, we present a contextual late fusion scheme that merges the classification scores of several concepts. We conducted experiments in the framework of the official TRECVID'06 evaluation campaign and obtained significant improvements with the proposed fusion schemes relative to the usual fusion schemes.
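To make the three fusion levels concrete, the following is a minimal sketch in plain Python. It is not the method evaluated in the paper: the function names, the choice of a Gaussian RBF kernel, the product combination of per-modality kernels, and the weighted-average late fusion are all assumptions made for illustration. The abstract does not specify the kernel combination, the normalization process, or the contextual late fusion rule.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel between two feature vectors (illustrative choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def early_fusion(features_by_modality):
    """Early fusion: concatenate per-modality vectors into one vector
    before a single classifier is trained on it."""
    fused = []
    for vec in features_by_modality:
        fused.extend(vec)
    return fused


def late_fusion(scores, weights=None):
    """Late fusion: combine the output scores of per-modality classifiers.
    A weighted average is assumed here; uniform weights by default."""
    if weights is None:
        weights = [1.0 / len(scores)] * len(scores)
    return sum(w * s for w, s in zip(weights, scores))


def fused_kernel(x_mods, y_mods, gammas):
    """Kernel-level fusion: one kernel per modality, combined by product.
    The product of positive semi-definite kernels is itself a valid kernel,
    so the result can be passed to an SVM as a precomputed kernel."""
    value = 1.0
    for x, y, gamma in zip(x_mods, y_mods, gammas):
        value *= rbf_kernel(x, y, gamma)
    return value


# Two hypothetical shots, each described by a visual and a text modality.
shot_a = [[0.1, 0.4], [1.0, 0.0, 0.0]]
shot_b = [[0.2, 0.3], [0.0, 1.0, 0.0]]

print(early_fusion(shot_a))                       # one concatenated vector
print(late_fusion([0.2, 0.8]))                    # averaged classifier scores
print(fused_kernel(shot_a, shot_b, [1.0, 0.5]))   # single fused similarity
```

In this sketch, giving each modality its own `gamma` lets the kernel width be tuned per modality. With RBF kernels, the product is equivalent to a single RBF kernel over the concatenated vector with a per-modality rescaling of distances, which is one way to see how kernel-level fusion relates to early fusion with normalization.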
