The study of documentary segmentation through audio and text understanding

Video segmentation is the first, and a critical, step in video indexing and retrieval. Previous work in this area has focused primarily on visual and audio information. In this paper, we investigate the segmentation of documentary video through audio and text understanding. To segment a continuous documentary video stream into subtopic segments, we explore two cues: music markers in the audio track and domain-independent text segmentation of the video's speech text. Experiments are presented and discussed.
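To make the idea of domain-independent text segmentation concrete, the following is a minimal sketch of a lexical-cohesion approach in the spirit of TextTiling. It is an illustration under stated assumptions, not the method evaluated in this paper: the function names (`tokenize`, `cosine`, `segment`), the window size, and the fixed similarity threshold are all hypothetical choices, and the input is assumed to be a list of transcript sentences.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def segment(sentences, window=2, threshold=0.1):
    """Return indices i where a subtopic boundary is placed before sentences[i].

    Assumption: a boundary is declared wherever the vocabulary of the
    `window` sentences before a gap and the `window` sentences after it
    has cosine similarity below a fixed `threshold`. This is a
    simplification; it uses no stop-word removal or depth scoring.
    """
    bags = [Counter(tokenize(s)) for s in sentences]
    boundaries = []
    for i in range(window, len(bags) - window + 1):
        left = sum(bags[i - window:i], Counter())
        right = sum(bags[i:i + window], Counter())
        if cosine(left, right) < threshold:
            boundaries.append(i)
    return boundaries


if __name__ == "__main__":
    transcript = [
        "lions hunt zebras savanna",
        "lions stalk zebras savanna grass",
        "savanna lions rest",
        "zebras graze savanna grass",
        "glaciers carve valleys ice",
        "ice glaciers melt slowly",
        "valleys fill meltwater ice",
        "glaciers retreat valleys",
    ]
    print(segment(transcript))
```

Because the comparison relies only on word overlap between neighboring blocks, it needs no domain-specific training data, which is what makes this family of methods domain-independent. In practice, a boundary proposed this way could be cross-checked against a nearby music marker, though how the two cues are combined is not specified here.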
