Content-based Table Tennis Games Highlight Detection Utilizing Audiovisual Clues

Audio and video are both important information carriers of multimedia content. In this paper, we propose an algorithm that uses audiovisual clues for sports highlight detection, with table tennis games as the studied scenario. Because audio and video carry different kinds of information that help locate highlights, we build two detectors that produce highlight candidates from audio and from video, respectively: the audio detector applies hidden Markov model (HMM) audio keyword modeling, and the video detector applies unsupervised shot clustering. Decision fusion then combines the audio and video highlight candidates to generate the final highlights. Experiments show promising results, with average precision of up to 90%.
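
The abstract does not state the fusion rule, so the following is only an illustrative sketch of one possible decision-fusion step, not the method used in the paper. It assumes that each highlight candidate is a time interval in seconds, and that a video candidate is kept only when it overlaps some audio candidate by at least a hypothetical threshold `min_overlap`. The function name, the interval representation, and the threshold are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumed representation: a candidate is a (start_sec, end_sec) interval.
Interval = Tuple[float, float]


def fuse_candidates(
    audio: List[Interval],
    video: List[Interval],
    min_overlap: float = 0.5,  # hypothetical threshold, in seconds
) -> List[Interval]:
    """Keep each video candidate that overlaps an audio candidate.

    A video candidate is accepted when its overlap with some audio
    candidate is at least `min_overlap` seconds. The fused highlight
    spans the union of the two matched intervals.
    """
    fused: List[Interval] = []
    for v_start, v_end in video:
        for a_start, a_end in audio:
            # Overlap length; negative when the intervals are disjoint.
            overlap = min(v_end, a_end) - max(v_start, a_start)
            if overlap >= min_overlap:
                fused.append((min(v_start, a_start), max(v_end, a_end)))
                break  # one supporting audio candidate is enough
    return fused


audio_candidates = [(10.0, 14.0), (40.0, 42.0)]
video_candidates = [(9.0, 13.0), (20.0, 25.0), (41.8, 45.0)]
highlights = fuse_candidates(audio_candidates, video_candidates)
```

In this example only the first video candidate is kept: the second has no audio support, and the third overlaps an audio candidate by less than the assumed half-second threshold. A weighted or score-based fusion would be an equally plausible reading of "decision fusion".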
