An approach for automated video indexing and video search in large lecture video archives

E-learning is the use of educational technology, information and communication technologies, and electronic media in education. It spans many types of media, including images, audio, video, streaming video, animation, e-books, and web-based, video-based, and audio-based learning. Distance learning does not require attending a school or college; anyone can learn from home or the office. The e-learning industry is economically significant: in 2000 it was estimated, by conservative estimates, to be worth more than $50 billion. The volume of lecture audio and video data on the internet is growing rapidly, so there is an immediate need for methods to retrieve lecture audio and video online. In this paper we present an approach to video search in lecture video archives. First, we apply video segmentation and key-frame detection to provide guidance for navigating video content. We then extract metadata by applying ASR (Automatic Speech Recognition) to the lecture audio and OCR (Optical Character Recognition) to the video content. OCR is also used for data entry from business documents, automatic number plate recognition, extracting business card information into a contact list, and similar tasks. ICR (Intelligent Character Recognition) targets handwritten documents, including cursive script, recognizing one character at a time, and usually involves machine learning. Speech recognition systems can be classified as continuous or discrete, and either kind can be speaker-independent, speaker-dependent, or adaptive. Discrete systems, which use a separate acoustic model for each word or phrase, are said to perform isolated word speech recognition (ISR). CSR (Continuous Speech Recognition) systems handle users who speak in continuous sentences.
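The pipeline outlined in the abstract (segment the video, pick key frames, attach OCR and ASR text to each segment, then search that text) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes frames are given as flat lists of pixel intensities in [0, 1], uses a simple mean-absolute-difference threshold as a stand-in for the paper's segmentation method, and takes the per-segment text as already produced by some OCR or ASR engine. The names `detect_key_frames`, `build_index`, and `search`, and the threshold value, are hypothetical.

```python
from collections import defaultdict


def detect_key_frames(frames, threshold=0.2):
    """Return indices of frames that start a new segment.

    Assumption: a frame is a key frame when its mean absolute pixel
    difference from the previous frame exceeds `threshold`.
    Frame 0 always starts the first segment.
    """
    key_frames = [0] if frames else []
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if diff > threshold:
            key_frames.append(i)
    return key_frames


def build_index(segment_texts):
    """Build an inverted index: word -> set of segment ids.

    `segment_texts` maps a segment id (here, its key-frame index) to the
    text recovered for that segment, e.g. OCR of the slide plus the ASR
    transcript. The text is assumed to be given; no OCR/ASR runs here.
    """
    index = defaultdict(set)
    for seg_id, text in segment_texts.items():
        for raw in text.lower().split():
            word = raw.strip(".,;:!?")
            if word:
                index[word].add(seg_id)
    return index


def search(index, query):
    """Return sorted ids of segments containing every query word."""
    words = [w.strip(".,;:!?") for w in query.lower().split()]
    words = [w for w in words if w]
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for w in words[1:]:
        result &= index.get(w, set())
    return sorted(result)


# Five synthetic 4-pixel "frames": two dark, two bright, one grey.
frames = [[0.0] * 4, [0.0] * 4, [1.0] * 4, [1.0] * 4, [0.5] * 4]
key_frames = detect_key_frames(frames)  # [0, 2, 4]

# Hypothetical OCR/ASR text attached to each detected segment.
segment_texts = {
    0: "Introduction to speech recognition",
    2: "Hidden Markov models for speech",
    4: "Optical character recognition on slides",
}
index = build_index(segment_texts)

print(key_frames)                           # [0, 2, 4]
print(search(index, "recognition"))         # [0, 4]
print(search(index, "speech recognition"))  # [0]
```

Because each hit is a key-frame index, a player could jump straight to the matching segment. A real system would replace the difference threshold with a more robust slide-transition detector and would rank results rather than return a plain set intersection.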
