Learning visual models of semantic concepts

Statistical machine learning provides a computational framework for mapping low-level media features to high-level semantic concepts. In this paper we expose the challenges that these techniques face. Using support vector machine (SVM) classification, we build models for 34 semantic concepts for the TREC 2002 benchmark corpus. We study how the number of training examples affects detection performance. We also examine low-level feature fusion and the sensitivity of SVM classifiers to their parameters.
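The three experimental questions in the abstract (training-set size, feature fusion, parameter sensitivity) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's setup: the data are synthetic rather than drawn from TREC 2002, the feature names "color" and "texture" are hypothetical, fusion is plain concatenation of feature vectors, and the classifier is a bias-free linear SVM trained with Pegasos-style stochastic subgradient descent, which may differ from the kernel and solver the authors used.

```python
import random

def make_data(n, rng):
    """Synthetic two-class data with two hypothetical feature types."""
    color, texture, labels = [], [], []
    for _ in range(n):
        y = rng.choice([-1, 1])
        # Each feature type has one weakly informative and one noise dimension.
        color.append([rng.gauss(0.8 * y, 1.0), rng.gauss(0.0, 1.0)])
        texture.append([rng.gauss(0.8 * y, 1.0), rng.gauss(0.0, 1.0)])
        labels.append(y)
    return color, texture, labels

def fuse(a, b):
    """Early fusion (assumed here): concatenate per-example feature vectors."""
    return [x + z for x, z in zip(a, b)]

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style subgradient descent on the regularized hinge loss.

    No bias term: the synthetic classes are symmetric about the origin.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1.0:  # hinge loss is active for this example
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def accuracy(w, X, y):
    """Fraction of examples whose predicted sign matches the label."""
    hits = sum(1 for xi, yi in zip(X, y)
               if yi * sum(wj * xj for wj, xj in zip(w, xi)) > 0)
    return hits / len(y)

rng = random.Random(42)
c_tr, t_tr, y_tr = make_data(400, rng)
c_te, t_te, y_te = make_data(1000, rng)
feature_sets = {
    "color": (c_tr, c_te),
    "fused": (fuse(c_tr, t_tr), fuse(c_te, t_te)),
}

# Effect of training-set size, for a single feature type and for fused features.
results = {}
for n in (10, 50, 400):
    for name, (X_tr, X_te) in feature_sets.items():
        w = train_linear_svm(X_tr[:n], y_tr[:n])
        results[(name, n)] = accuracy(w, X_te, y_te)

# Sensitivity to the regularization parameter, on fused features.
sensitivity = {}
for lam in (0.001, 0.01, 1.0):
    w = train_linear_svm(feature_sets["fused"][0], y_tr, lam=lam)
    sensitivity[lam] = accuracy(w, feature_sets["fused"][1], y_te)

for key in sorted(results):
    print(key, round(results[key], 3))
for lam in sorted(sensitivity):
    print("lam", lam, round(sensitivity[lam], 3))
```

Printing `results` and `sensitivity` side by side gives a small-scale analogue of the comparisons the abstract describes; the numbers say nothing about the paper's actual findings.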
