Towards an automatic semantic annotation for multimedia learning objects

The number of digital video recordings has increased dramatically. The idea of recording lectures, speeches, and other academic events is not new. However, the accessibility and traceability of their content for further use is rather limited. Searching multimedia data, in particular audiovisual data, is still a challenging task. We describe and evaluate a new approach to generating a semantic annotation for multimedia resources, i.e., recorded university lectures. Speech recognition is applied to create a tentative and deficient transliteration of the video recordings. We show that the imperfect transliteration is sufficient to generate semantic metadata serialized in an OWL file. The semantic annotation process, based on textual material and deficient transliterations of lecture recordings, is discussed and evaluated.
