Story boundary detection in large broadcast news video archives: techniques, experience and trends

The segmentation of news video into story units is an important step towards effective processing and management of large news video archives. In the story segmentation task of TRECVID 2003, many research groups employed a wide variety of techniques to segment more than 120 hours of news video. These techniques range from simple anchor-person detectors to sophisticated machine learning models based on hidden Markov model (HMM) and Maximum Entropy (ME) approaches. Overall, the results indicate that judicious use of multi-modal features, coupled with rigorous machine learning models, can produce effective solutions. This paper presents the algorithms used and the experience gained in the TRECVID evaluations. It also points the way towards the development of scalable technology for processing large news video corpora.
