Multi-modal characteristics analysis and fusion for TV commercial detection

Automatic TV commercial detection has become an indispensable part of content-based video analysis owing to the explosive growth in the volume of TV commercials. This paper proposes a multi-modal commercial digesting scheme, covering the visual, audio and textual modalities, that addresses two challenges in commercial detection: generating mid-level semantic descriptors and applying an effective discrimination method. Compared with general programs, commercials are deliberately embedded with distinctive semantic characteristics intended to attract more attention from the audience. To exploit these characteristics, a novel commercial-oriented descriptor derived from the textual modality is proposed, alongside commonly used descriptors from the audio and visual modalities. To better discriminate commercials from general programs in the multi-modal representation space, Tri-AdaBoost, a self-learning method in which the modalities interact with one another, is introduced to form a final consolidated decision. In addition, a heuristic post-processing strategy based on temporal consistency further reduces false alarms. Experimental results on large video data collections show the effectiveness of the proposed scheme.
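
The abstract does not spell out the post-processing heuristic, so the following is only a minimal sketch of one common way to enforce temporal consistency: a sliding-window majority vote over per-shot labels. The function name `smooth_labels`, the `window` parameter and the choice of majority voting are all illustrative assumptions, not the method used in the paper.

```python
from typing import List


def smooth_labels(labels: List[int], window: int = 5) -> List[int]:
    """Majority-vote smoothing of per-shot labels (1 = commercial, 0 = program).

    Hypothetical helper: illustrates temporal-consistency post-processing
    in general, not the specific heuristic of the paper.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        # Centered window, truncated at the sequence boundaries.
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        neighborhood = labels[lo:hi]
        ones = sum(neighborhood)
        zeros = len(neighborhood) - ones
        if ones > zeros:
            smoothed.append(1)
        elif zeros > ones:
            smoothed.append(0)
        else:
            # Tie (possible only in a truncated window): keep the original label.
            smoothed.append(labels[i])
    return smoothed


if __name__ == "__main__":
    raw = [0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0]
    print(smooth_labels(raw, window=3))
```

In this sketch an isolated commercial label surrounded by program shots is treated as a false alarm and flipped, while a brief gap inside a run of commercial shots is filled in. A larger window suppresses more isolated errors at the cost of blurring true boundaries.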
