Subtopic segmentation in lecture speech for the creation of lecture video contents

Video instructional materials that can be used over a network are still rare, although their number is increasing. One reason for this rarity is thought to be the time and effort that video editing requires. To support the preparation of such materials, the authors examine a method for automatically estimating subtopic segmentation positions from the speech in an unedited lecture video. Indexes are obtained by applying independent component analysis to the text produced by speech recognition of the video, and subtopic segmentation positions are estimated by comparing successive indexes using dynamic programming. In an experiment using unedited lecture videos from five instructors, the proposed method was found to segment subtopics as well as or better than the Hearst method, while allowing the number of segments to be set freely. It was also confirmed that subtopic segmentation based on speech recognition output performed equivalently to segmentation based on transcribed text. © 2006 Wiley Periodicals, Inc. Syst Comp Jpn, 37(10): 13–21, 2006; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.20540
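
As a rough illustration of how dynamic programming can place a freely chosen number of segment boundaries over a sequence of indexes, the following minimal sketch may help. It is not the authors' implementation. It assumes each index is already available as a numeric vector (for example, one vector of independent component weights per text window), and it assumes a within-segment squared-error cost, which the abstract does not specify. The names segment_cost and segment_boundaries are hypothetical.

```python
from typing import List, Sequence


def segment_cost(vectors: Sequence[Sequence[float]], start: int, end: int) -> float:
    """Sum of squared distances to the segment mean for vectors[start:end].

    Assumed cost function: the abstract does not state which one is used.
    """
    n = end - start
    dim = len(vectors[0])
    mean = [sum(vectors[i][d] for i in range(start, end)) / n for d in range(dim)]
    return sum(
        (vectors[i][d] - mean[d]) ** 2
        for i in range(start, end)
        for d in range(dim)
    )


def segment_boundaries(
    vectors: Sequence[Sequence[float]], num_segments: int
) -> List[int]:
    """Return the start positions of segments 2..num_segments.

    Dynamic programming over all ways of cutting the sequence into
    num_segments contiguous, non-empty segments with minimum total cost.
    """
    n = len(vectors)
    if not 1 <= num_segments <= n:
        raise ValueError("num_segments must be between 1 and len(vectors)")

    inf = float("inf")
    # best[k][j]: minimum cost of splitting vectors[:j] into k segments.
    best = [[inf] * (n + 1) for _ in range(num_segments + 1)]
    # back[k][j]: start position of the last segment in that optimal split.
    back = [[0] * (n + 1) for _ in range(num_segments + 1)]
    best[0][0] = 0.0

    for k in range(1, num_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                if best[k - 1][i] == inf:
                    continue
                cost = best[k - 1][i] + segment_cost(vectors, i, j)
                if cost < best[k][j]:
                    best[k][j] = cost
                    back[k][j] = i

    # Trace back from the full sequence to recover the boundaries.
    boundaries = []
    j = n
    for k in range(num_segments, 0, -1):
        i = back[k][j]
        if k > 1:
            boundaries.append(i)
        j = i
    return sorted(boundaries)


# Toy input: three windows about one subtopic, then four about another.
toy = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 4
print(segment_boundaries(toy, 2))
```

Because the number of segments is an argument to the search rather than a by-product of a similarity threshold, this kind of formulation lets the caller set it freely, which is the property the abstract contrasts with the Hearst method.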