Video Segmentation Derived from Speech Using ICA

This paper proposes a method for segmenting lecture videos into scenes based on the speech signal, for use in creating educational video content. To represent the subtopic of each scene, the text recognized from the lecture speech by automatic speech recognition (ASR) was converted into an index using independent component analysis (ICA) instead of conventional TF-IDF. Segmentation was performed by dynamic programming that minimizes the sum of cosine measures between adjacent indexes representing the subtopics of video scenes. The validity of the proposed method was evaluated on a sample lecture video of approximately 60 minutes. The results indicate that the proposed method using ICA obtained approximately 96% recall in the absence of ASR errors. Segmentation using ICA was also much faster than segmentation using TF-IDF because the ICA index was smaller than the TF-IDF index.
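The dynamic programming step can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: each sentence is already represented by a low-dimensional index vector (for example, ICA component weights), the index of a scene is the sum of its sentence vectors, and the number of scenes is given in advance. The names `cosine` and `segment` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine measure between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def segment(vectors, num_segments):
    """Split `vectors` into `num_segments` contiguous scenes so that the sum
    of cosine measures between adjacent scene indexes is minimal.

    Assumption (not from the paper): a scene index is the sum of the
    sentence vectors it contains. Returns (boundaries, cost), where each
    boundary is the position of the first sentence of a new scene.
    """
    n = len(vectors)
    if not 1 <= num_segments <= n:
        raise ValueError("num_segments must be between 1 and len(vectors)")
    dim = len(vectors[0])

    # Prefix sums let us build the index of any span [i, j) cheaply.
    prefix = [[0.0] * dim]
    for v in vectors:
        prefix.append([p + x for p, x in zip(prefix[-1], v)])

    def span(i, j):
        return [b - a for a, b in zip(prefix[i], prefix[j])]

    # best[k][(i, j)] = (cost, h): minimal cost of covering sentences
    # [0, j) with k scenes whose last scene is [i, j); h is the start
    # of the previous scene (None for the first scene).
    best = [{} for _ in range(num_segments + 1)]
    for j in range(1, n + 1):
        best[1][(0, j)] = (0.0, None)
    for k in range(2, num_segments + 1):
        for (h, i), (cost, _) in best[k - 1].items():
            left = span(h, i)
            for j in range(i + 1, n + 1):
                c = cost + cosine(left, span(i, j))
                if (i, j) not in best[k] or c < best[k][(i, j)][0]:
                    best[k][(i, j)] = (c, h)

    # Pick the cheapest full cover, then walk the back-pointers.
    cost, i = min((c, i) for (i, j), (c, _) in best[num_segments].items()
                  if j == n)
    boundaries, j, k = [], n, num_segments
    while k > 1:
        boundaries.append(i)
        h = best[k][(i, j)][1]
        i, j, k = h, i, k - 1
    return sorted(boundaries), cost


# Toy input: three sentences on one subtopic followed by two on another.
toy = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
print(segment(toy, 2))
```

This sketch runs in time cubic in the number of sentences for each additional scene, so it is meant only to make the objective concrete. Because each cosine is computed over the index dimension, a smaller index reduces the cost of every comparison, which is consistent with the speed advantage reported for ICA over TF-IDF.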