Multi-granular detection of regional semantic concepts

Many interesting visual semantic concepts occur at sub-frame granularity in images, occupying one or more regions rather than the whole frame. Detecting these concepts is challenging because automatic segmentation is imperfect. We propose multi-granular detection of visual concepts that have regional support. We build a single set of support vector machine based binary concept models from a training set with manually marked-up regions. In this paper we show that detection can be significantly improved by scoring these models over multiple granularities in the test set images, where the regions are automatically detected as a preprocessing step in detection. Using 27 regional semantic concepts from the NIST TRECVID 2003 common annotation lexicon and corpus, we demonstrate that multi-granular detection improves detection performance.
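The scoring step described above can be illustrated with a minimal sketch. It assumes that each test image has already been split into regions at several granularities, that each region is represented by a feature vector, and that scores are fused by taking the maximum over regions and granularities. The fusion rule, the granularity names, the feature values, and the helpers `linear_score` and `multigranular_score` are all hypothetical; the abstract does not specify them. A linear scoring function stands in for the decision function of a trained SVM concept model.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Feature = Sequence[float]
Scorer = Callable[[Feature], float]


def linear_score(weights: Sequence[float], bias: float) -> Scorer:
    """Return a scorer that stands in for a trained SVM decision function."""
    def score(feature: Feature) -> float:
        return sum(w * x for w, x in zip(weights, feature)) + bias
    return score


def multigranular_score(
    regions_by_granularity: Dict[str, List[Feature]],
    score: Scorer,
) -> Tuple[float, Dict[str, float]]:
    """Score one concept model over every region at every granularity.

    Returns the fused score and the best score per granularity.
    Max fusion is an assumption made for this sketch.
    """
    per_granularity: Dict[str, float] = {}
    for name, regions in regions_by_granularity.items():
        if regions:  # skip granularities where segmentation found nothing
            per_granularity[name] = max(score(r) for r in regions)
    if not per_granularity:
        raise ValueError("image has no regions at any granularity")
    return max(per_granularity.values()), per_granularity


# Hypothetical two-dimensional region features for a single test image.
image_regions = {
    "frame": [[0.4, 0.6]],
    "coarse": [[0.9, 0.1], [0.1, 0.9]],
    "fine": [[0.2, 0.8], [1.0, 0.0], [0.5, 0.5]],
}
concept_model = linear_score(weights=[1.0, -0.5], bias=-0.2)
fused, per_granularity = multigranular_score(image_regions, concept_model)
```

In this toy input the whole-frame score is negative while a fine-grained region scores well above zero, which is the situation where scoring over several granularities would be expected to help a concept that occupies only part of the frame.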
