Video Segmentation derived from speech using ICA
This paper proposes a method for segmenting lecture video into scenes based on the speech signal, with the aim of creating educational video content. To represent the subtopic of each scene, the text recognized from the lecture speech by automatic speech recognition (ASR) is converted into an index using independent component analysis (ICA) instead of conventional TF-IDF. Segmentation is performed by dynamic programming that minimizes the sum of cosine measures between the indexes of adjacent scenes. The method was evaluated on a sample lecture video of approximately 60 minutes. The ICA-based method achieved approximately 96% recall when ASR errors were absent. It was also much faster than the TF-IDF-based method because the ICA index is smaller than the TF-IDF index.
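The dynamic-programming step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each speech unit (for example, one recognized sentence) already has a feature vector standing in for its ICA-derived index, that a scene's index is the sum of its units' vectors, and that the number of scenes is given in advance. The names `cosine`, `segment`, `features`, and `num_segments` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine measure between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def segment(features, num_segments):
    """Split the sequence into contiguous scenes so that the sum of cosine
    measures between adjacent scene indexes is minimal.

    Assumption: a scene's index is the sum of its units' feature vectors.
    Returns (start positions of the scenes, minimal total cost).
    """
    n = len(features)
    if not 1 <= num_segments <= n:
        raise ValueError("num_segments must be between 1 and len(features)")

    # Prefix sums make the index of any span [i, j) cheap to compute.
    dim = len(features[0])
    prefix = [[0.0] * dim]
    for row in features:
        prefix.append([p + float(x) for p, x in zip(prefix[-1], row)])

    def span_index(i, j):
        return [b - a for a, b in zip(prefix[i], prefix[j])]

    # best[(i, j)]: minimal cost of covering units [0, j) with k scenes,
    # where the last scene is [i, j). Start with k = 1 (no adjacent pair yet).
    best = {(0, j): 0.0 for j in range(1, n + 1)}
    back = []  # one back-pointer table per added scene
    for _ in range(2, num_segments + 1):
        new_best, pointers = {}, {}
        for (h, i), cost in best.items():
            left = span_index(h, i)
            for j in range(i + 1, n + 1):
                cand = cost + cosine(left, span_index(i, j))
                if cand < new_best.get((i, j), float("inf")):
                    new_best[(i, j)] = cand
                    pointers[(i, j)] = h
        best = new_best
        back.append(pointers)

    # Only states whose last scene ends at n cover the whole sequence.
    finals = {key: cost for key, cost in best.items() if key[1] == n}
    (i, j), total = min(finals.items(), key=lambda kv: kv[1])

    # Walk the back-pointers to recover every scene start.
    starts = [i]
    for pointers in reversed(back):
        h = pointers[(i, j)]
        i, j = h, i
        starts.append(i)
    return starts[::-1], total


# Toy input: nine units drawn from three non-overlapping subtopics.
toy = [[1, 0, 0]] * 3 + [[0, 1, 0]] * 2 + [[0, 0, 1]] * 4
starts, total = segment(toy, 3)
print(starts, total)
```

Because the cost of a boundary depends on both neighbouring scenes, the state tracks the start and end of the last scene, which makes this sketch cubic in the number of units per added scene. The shorter the feature vectors, the cheaper each cosine evaluation, which is consistent with the speed advantage reported for the smaller ICA index.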