High-level event detection in video exploiting discriminant concepts

This paper presents a new approach to video event detection that combines visual concept detection scores with a new dimensionality reduction technique. Specifically, a video is first decomposed into a sequence of shots, and trained visual concept detectors are used to represent the video content as a sequence of model vectors. Subsequently, an improved subclass discriminant analysis method is used to derive a concept subspace for detecting and recognizing high-level events. In this subspace, the median Hausdorff distance is used to implicitly align and compare event videos of different lengths, and the nearest neighbor rule is used to recognize the event depicted in a video. Evaluation results obtained through our participation in the Multimedia Event Detection Task of the TRECVID 2010 competition verify the effectiveness of the proposed approach for event detection and recognition in large-scale video collections.
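The matching step described above can be illustrated with a minimal sketch. It assumes each video has already been reduced to a sequence of low-dimensional vectors (one per shot) in the concept subspace. The abstract does not give the exact form of the median Hausdorff distance, so the version below is an assumption: the directed distance is the median, over the shots of one video, of the Euclidean distance to the closest shot of the other video, and the two directed distances are combined with a maximum. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from statistics import median


def directed_median_hausdorff(a, b):
    """Median, over vectors in `a`, of the distance to the closest vector in `b`."""
    return median(min(math.dist(x, y) for y in b) for x in a)


def median_hausdorff(a, b):
    """Symmetric variant: the larger of the two directed distances (an assumption)."""
    return max(directed_median_hausdorff(a, b), directed_median_hausdorff(b, a))


def nearest_neighbor_event(query, training):
    """Return the label of the training video closest to `query`.

    `training` is a list of (label, sequence) pairs, where each sequence is a
    list of equal-dimension vectors and sequences may differ in length.
    """
    label, _ = min(training, key=lambda item: median_hausdorff(query, item[1]))
    return label


# Toy 2-D "subspace" sequences of different lengths (made-up values).
training = [
    ("event_a", [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]),
    ("event_b", [(1.0, 1.0), (0.9, 1.0)]),
]
query = [(0.05, 0.0), (0.0, 0.05), (0.1, 0.1), (0.0, 0.0)]

print(nearest_neighbor_event(query, training))  # prints "event_a"
```

Because each shot is matched to its closest counterpart regardless of position, no explicit temporal alignment is needed, and taking the median rather than the maximum makes the comparison less sensitive to a few outlier shots.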
