Automatic Segmentation, Aggregation and Indexing of Multimodal News Information from Television and the Internet

The global diffusion of the Internet has enabled the distribution of informative content through dynamic media such as RSS feeds and video blogs. At the same time, the decreasing cost of electronic devices has made the same informative content pervasively available in the form of digital audiovisual data. This article presents a system for the large-scale unsupervised acquisition, segmentation and indexing of TV newscasts. In particular, it discusses the principles and performance of the parts of the system dedicated to the detection and segmentation of programmes from the acquired stream. In addition to the core technology, we introduce and discuss a novel method for assessing the results of story boundary segmentation algorithms, based on a user-validated measurement. Because current news distribution channels are heterogeneous, a further innovative aspect of this article is the description of a framework for multimodal information aggregation. The core of this framework is a cross-modal clustering process, for which a novel, asymmetric similarity measure is provided. The implemented prototype uses online news articles and TV news programmes as information sources, and provides a multimodal service integrating both contributions. Experimental evaluation of the system demonstrates the effectiveness of the method in the studied case.
