A Framework for Automatic Topic Segmentation in Video Lectures

Video lectures are now a very popular way to transmit knowledge, and as a result many repositories on the web offer large catalogs of these videos. Despite the benefits of this high availability, some problems also emerge. One of them is that it is very difficult to find relevant content associated with these videos. Students often must watch an entire video lecture to find a point of interest, and sometimes they do not find it at all. For that reason, this master’s project investigates and proposes a novel framework for automatic topic segmentation in video lectures, based on early fusion of low- and high-level audio features enriched with external knowledge from open databases. We have performed preliminary experiments on two sets of video lectures using the current state of our work. The results obtained were very satisfactory, which indicates the potential of our proposal.
