Semantic concept detection for video based on extreme learning machine

Semantic concept detection is an important step in concept-based semantic video retrieval, which can be regarded as an intermediate descriptor to bridge the semantic gap. Most existing concept detection methods utilize Support Vector Machines (SVM) as concept classifier. However, there are several drawbacks of using SVM, such as the high computational cost and large number of parameters to be optimized. In this paper we propose an Extreme Learning Machine (ELM) based Multi-modality Classifier Combination Framework (MCCF) to improve the accuracy of semantic concept detection. In this framework: (i) three ELM classifiers are trained by exploring three kinds of visual features respectively, (ii) a probability-based fusion method is then proposed to combine the prediction results of each ELM classifier, (iii) we integrate the prediction results of ELM classifier with the information of contextual correlation among concepts to further improve the accuracy of semantic concept detection. Experiments on the widely used TRECVID datasets demonstrate that our approach can effectively improve the accuracy of semantic concept detection and achieve performance at extremely high speed.

[1]  Chong-Wah Ngo,et al.  Towards optimal bag-of-features for object categorization and semantic video retrieval , 2007, CIVR '07.

[2]  Winston H. Hsu,et al.  Brief Descriptions of Visual Features for Baseline TRECVID Concept Detectors , 2006 .

[3]  Yoram Singer,et al.  Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers , 2000, J. Mach. Learn. Res..

[4]  Guang-Bin Huang,et al.  Extreme learning machine: a new learning scheme of feedforward neural networks , 2004, 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541).

[5]  Qing He,et al.  Extreme Support Vector Machine Classifier , 2008, PAKDD.

[6]  D. Serre Matrices: Theory and Applications , 2002 .

[7]  Marcel Worring,et al.  Adding Semantics to Detectors for Video Retrieval , 2007, IEEE Transactions on Multimedia.

[8]  Gordon V. Cormack,et al.  Validity and power of t-test for comparing MAP and GMAP , 2007, SIGIR.

[9]  Wei-Ying Ma,et al.  Recent Advances and Challenges of Semantic Image/Video Search , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[10]  John R. Smith,et al.  Multimedia semantic indexing using model vectors , 2003, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698).

[11]  Paul Over,et al.  Evaluation campaigns and TRECVid , 2006, MIR '06.

[12]  Marcel Worring,et al.  The challenge problem for automated detection of 101 semantic concepts in multimedia , 2006, MM '06.

[13]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[14]  Shih-Fu Chang Recent Advances and Open Issues of Digital Image/Video Search , 2007, Eighth International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS '07).

[15]  John R. Smith,et al.  On the detection of semantic concepts at TRECVID , 2004, MULTIMEDIA '04.

[16]  Guang-Bin Huang,et al.  Convex incremental extreme learning machine , 2007, Neurocomputing.

[17]  Chee Kheong Siew,et al.  Can threshold networks be trained directly? , 2006, IEEE Transactions on Circuits and Systems II: Express Briefs.

[18]  Fuzhen Zhuang,et al.  A parallel incremental extreme SVM classifier , 2011, Neurocomputing.

[19]  Bo Lu,et al.  Multi-information fusion for uncertain semantic representations of videos , 2010, CIKM '10.

[20]  Lei Chen,et al.  Enhanced random search based incremental extreme learning machine , 2008, Neurocomputing.

[21]  John R. Smith,et al.  Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues , 2003, EURASIP J. Adv. Signal Process..

[22]  Marcel Worring,et al.  Multimodal Video Indexing : A Review of the State-ofthe-art , 2001 .

[23]  Chee Kheong Siew,et al.  Extreme learning machine: Theory and applications , 2006, Neurocomputing.

[24]  Hongming Zhou,et al.  Optimization method based extreme learning machine for classification , 2010, Neurocomputing.

[25]  Shih-Fu Chang,et al.  A reranking approach for context-based concept fusion in video indexing and retrieval , 2007, CIVR '07.

[26]  Shih-Fu Chang,et al.  Columbia University’s Baseline Detectors for 374 LSCOM Semantic Visual Concepts , 2007 .