Combining multimodal preferences for multimedia information retrieval

Representing and fusing multimedia information is key to discovering semantics in multimedia. In this paper we address the problem of multimedia content retrieval, first by defining a novel preference-based representation particularly suited to the fusion problem, and then by investigating the RankBoost algorithm to combine those preferences and learn a multimodal retrieval model. The approach has been tested on annotated images and on the complete TRECVID 2005 corpus and compared with SVM-based fusion strategies. The results show that our approach matches SVM performance but, unlike SVM, is parameter-free and faster.
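To make the fusion step concrete, the following is a minimal sketch of RankBoost-style preference combination, not the implementation evaluated in the paper. It assumes each modality supplies one score per item (the columns of a hypothetical `features` array), that preferences are given as index pairs `(lo, hi)` meaning item `hi` should rank above item `lo`, and that weak rankers are simple binary thresholds on a single modality. The names `rankboost` and `score` are illustrative only.

```python
import numpy as np

def rankboost(features, pairs, rounds=10):
    """Learn a weighted sum of threshold rankers from pairwise preferences.

    features: (n_items, n_modalities) array of per-modality scores (assumed layout).
    pairs: sequence of (lo, hi) item indices; item hi is preferred over item lo.
    """
    features = np.asarray(features, dtype=float)
    pairs = np.asarray(pairs)
    lo, hi = pairs[:, 0], pairs[:, 1]
    # Start from a uniform distribution over the preference pairs.
    D = np.full(len(pairs), 1.0 / len(pairs))
    model = []
    for _ in range(rounds):
        # Pick the threshold ranker with the largest weighted margin |r|.
        best = None
        for j in range(features.shape[1]):
            for theta in np.unique(features[:, j]):
                h = (features[:, j] >= theta).astype(float)
                r = np.sum(D * (h[hi] - h[lo]))
                if best is None or abs(r) > abs(best[0]):
                    best = (r, j, theta)
        r, j, theta = best
        # Clip so the weight stays finite when a ranker orders every pair correctly.
        r = np.clip(r, -1 + 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 + r) / (1 - r))
        model.append((alpha, j, theta))
        # Increase the weight of pairs the chosen ranker still orders wrongly.
        h = (features[:, j] >= theta).astype(float)
        D = D * np.exp(alpha * (h[lo] - h[hi]))
        D /= D.sum()
    return model

def score(model, features):
    """Final ranking score: weighted sum of the selected threshold rankers."""
    features = np.asarray(features, dtype=float)
    return sum(alpha * (features[:, j] >= theta) for alpha, j, theta in model)

# Toy data: items 0 and 1 are relevant, items 2 and 3 are not.
# Neither modality alone ranks both relevant items above both others.
features = np.array([[0.9, 0.1],
                     [0.2, 0.8],
                     [0.1, 0.2],
                     [0.3, 0.15]])
pairs = [(2, 0), (3, 0), (2, 1), (3, 1)]
model = rankboost(features, pairs, rounds=3)
scores = score(model, features)
```

In this toy setting the combined score places both relevant items above both non-relevant ones after three rounds, even though no single modality does so on its own. The only setting in the sketch is the number of boosting rounds.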
