Paper information - Incorporating Audio Cues into Dialog and Action Scene Extraction

In this paper, we present a top-down approach to extracting scenes from video. The approach uses video editing rules and audio cues to extract simple dialog and action scenes. The underlying model is a finite state machine coupled with audio cues, which are produced by an audio classifier.
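The abstract does not specify the states, transitions, or audio classes of the model, so the following is only an illustrative sketch of how a finite state machine might combine an editing rule with an audio cue. It assumes each shot already carries a visual-similarity label and an audio class from some classifier, and it treats a dialog scene as a run of alternating shots (A, B, A, B, ...) whose audio is speech. The names `Shot`, `cluster`, `audio`, `extract_dialog_scenes`, and `min_shots` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Shot:
    cluster: str  # hypothetical visual-similarity label, e.g. "A" or "B"
    audio: str    # hypothetical audio class, e.g. "speech", "music", "noise"


def extract_dialog_scenes(shots: List[Shot], min_shots: int = 4) -> List[Tuple[int, int]]:
    """Return (start, end) shot-index ranges, end exclusive, of candidate dialog scenes.

    Two-state machine: idle (start is None) or inside a candidate run.
    A run continues while shots alternate between two clusters and the
    audio class stays "speech". This is an assumed rule, not the paper's.
    """
    scenes: List[Tuple[int, int]] = []
    start: Optional[int] = None  # None means the machine is idle

    for i, shot in enumerate(shots):
        if shot.audio != "speech":
            # Audio cue fails: close any open run and return to idle.
            if start is not None and i - start >= min_shots:
                scenes.append((start, i))
            start = None
            continue

        if start is None:
            start = i  # first speech shot opens a candidate run
            continue

        run_len = i - start
        if run_len == 1:
            # Second shot must differ visually from the first.
            alternates = shot.cluster != shots[i - 1].cluster
        else:
            # Later shots must match the shot two positions back.
            alternates = shot.cluster == shots[i - 2].cluster

        if not alternates:
            # Editing rule fails: emit the run if long enough, then
            # greedily restart the run at the current shot.
            if run_len >= min_shots:
                scenes.append((start, i))
            start = i

    # Flush a run that is still open at the end of the video.
    if start is not None and len(shots) - start >= min_shots:
        scenes.append((start, len(shots)))
    return scenes
```

For example, four speech shots labeled A, B, A, B followed by a music shot would yield one candidate scene covering the first four shots, while the same visual pattern with non-speech audio would yield none. The restart on failure is greedy, so this sketch can miss a longer alternating run that begins one shot earlier.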
