Segmenting Lecture Videos by Topic: From Manual to Automated Methods

A growing number of universities and corporations provide videotaped lectures online for knowledge sharing and learning. Segmenting lecture videos into short clips by topic exposes the hidden information structure of the videos and makes it easier to search them and learn from them. Manual segmentation is highly accurate but very labor intensive. To develop a high-performance automated segmentation method for lecture videos, we conducted a case study of how humans segment lectures and which segmentation features they rely on. Based on the findings from the case study, we designed an automated segmentation approach with two phases: initial segmentation and segmentation refinement. The approach combines segmentation features from three information sources in a lecture video (the speech text transcript, audio, and video) and draws on several knowledge sources, such as world knowledge and domain knowledge. Our preliminary results show that the proposed two-phase approach is promising.
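
To make the two-phase idea concrete, the following is a minimal sketch and not the method evaluated in this work. It assumes, purely for illustration, that the initial segmentation proposes a boundary wherever lexical similarity between adjacent windows of transcript sentences is low, and that the refinement keeps only those candidates that coincide with a long pause in the audio. The function names, the stopword list, the thresholds, and the toy data are all hypothetical.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical, deliberately tiny stopword list (an assumption of this sketch).
STOPWORDS = {"a", "an", "the", "in", "to", "and", "is", "of", "by", "that", "it"}


def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic tokens minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def initial_segmentation(sentences, window=2, threshold=0.2):
    """Phase 1 (assumed): propose a boundary before sentence i when the words
    in the `window` sentences before i and the `window` sentences from i on
    have cosine similarity below `threshold`."""
    boundaries = []
    for i in range(1, len(sentences)):
        left = Counter(w for s in sentences[max(0, i - window):i] for w in tokenize(s))
        right = Counter(w for s in sentences[i:i + window] for w in tokenize(s))
        if cosine(left, right) < threshold:
            boundaries.append(i)
    return boundaries


def refine(boundaries, pause_before, min_pause=1.0):
    """Phase 2 (assumed): keep a candidate boundary only if the silence
    preceding that sentence in the audio lasts at least `min_pause` seconds."""
    return [b for b in boundaries if pause_before[b] >= min_pause]


# Toy transcript: three sentences about stacks, then three about sorting.
sentences = [
    "A stack stores items in last in first out order",
    "Push adds an item to the stack and pop removes the top item",
    "The stack is often used to evaluate expressions",
    "Sorting arranges records by key",
    "Quicksort partitions records around a pivot key",
    "Merge sorting splits records and merges sorted halves",
]
# Hypothetical pause length (seconds) measured before each sentence.
pause_before = [0.0, 0.3, 0.2, 2.5, 0.4, 0.3]

candidates = initial_segmentation(sentences)
final = refine(candidates, pause_before)
print("candidate boundaries:", candidates)
print("refined boundaries:", final)
```

On this toy input the text-only pass tends to over-segment, because short sentences share few words, and the audio cue is used to discard the weaker candidates. A fuller system along the lines described above would also draw on video features and on world and domain knowledge, which this sketch leaves out.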
