Multi-modal summarization of key events and top players in sports tournament videos

To detect and annotate key events in live sports videos, we need to bridge the semantic gap in audiovisual information. Previous work has successfully extracted semantics from time-stamped web match reports, which are synchronized with the video content. However, web and social media articles without time-stamps have not been fully leveraged, even though they are increasingly used to complement the coverage of major sporting tournaments. This paper addresses this limitation with a novel multimodal summarization framework based on sentiment analysis and players' popularity. It uses audiovisual content, web articles, blogs, and commentators' speech to automatically annotate and visualize the key events and key players in sports tournament coverage. The experimental results demonstrate that the automatically generated video summaries are aligned with the events identified from the official website match reports.
