Multi-modal Interview Concept Detection for Rushes Exploitation

According to the Large-Scale Concept Ontology for Multimedia (LSCOM) and the requirements of the fourth task of TRECVID 2006, rushes exploitation, "interview" is an important semantic concept for rushes content analysis. This paper presents a shot-level method for detecting the "interview" concept. Face detection and audio classification are applied to detect the "face" and "speech" concepts in each shot. The "interview" concept is then detected by integrating this audiovisual information. The method is intended to support video editing. Large-scale experimental results demonstrate the accuracy and effectiveness of the proposed method.
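The audiovisual integration step can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so this example assumes a simple conjunction: a shot is labeled "interview" when both the "face" and "speech" concepts are present. The `Shot` type, the per-shot ratios, and the two thresholds are hypothetical names and values chosen for illustration, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    """Per-shot evidence from the two detectors (hypothetical representation)."""
    shot_id: str
    face_ratio: float    # assumed: fraction of sampled frames containing a face
    speech_ratio: float  # assumed: fraction of audio segments classified as speech


def detect_interview(shot: Shot,
                     face_thresh: float = 0.5,
                     speech_thresh: float = 0.5) -> bool:
    """Label a shot as "interview" when both concepts are detected.

    The AND rule and the 0.5 thresholds are assumptions for this sketch.
    """
    has_face = shot.face_ratio >= face_thresh
    has_speech = shot.speech_ratio >= speech_thresh
    return has_face and has_speech


# Illustrative input values, not experimental data.
shots = [
    Shot("shot_001", face_ratio=0.9, speech_ratio=0.8),  # face and speech
    Shot("shot_002", face_ratio=0.9, speech_ratio=0.1),  # face, no speech
    Shot("shot_003", face_ratio=0.0, speech_ratio=0.9),  # speech, no face
]
labels = {s.shot_id: detect_interview(s) for s in shots}
```

In practice the two ratios would come from a face detector run on sampled frames and an audio classifier run on the shot's soundtrack; a learned fusion model could replace the fixed thresholds.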