Detecting indexical signs in film audio for scene interpretation

In this paper, we study the sound tracks in films and their indexical semiotic usage by developing a classification system that detects complex sound scenes and their constituent sound events in cinema. We investigate two main issues in this paper: Determination of what constitutes the presence of a high level sound scene and inferences about the thematic content of the scene that can be drawn from this presence, and classification of environmental sounds in the audio track of the scene, to assist in the automatic detection of the high level scene. Experiments with our classification system on pure sounds resulted in a correct event classification rate of 88.9%. When the audio content of a number of film scenes was examined, though a lower accuracy resulted with sound event detection due to the presence of mixed sounds, the film audio samples were generally classified with the correct high-level sound scene label, enabling correct inferences about the story content of the scenes.

[1]  C.-C. Jay Kuo,et al.  Classification and retrieval of sound effects in audiovisual data management , 1999, Conference Record of the Thirty-Third Asilomar Conference on Signals, Systems, and Computers (Cat. No.CH37020).

[2]  Wolfgang Effelsberg,et al.  Automatic audio content analysis , 1997, MULTIMEDIA '96.

[3]  Hiroshi Hamada,et al.  Video Handling with Music and Speech Detection , 1998, IEEE Multim..

[4]  Jonathan Foote,et al.  Content-based retrieval of music and audio , 1997, Other Conferences.

[5]  Tsuhan Chen,et al.  Audio feature extraction and analysis for scene classification , 1997, Proceedings of First Signal Processing Society Workshop on Multimedia Signal Processing.

[6]  Douglas Keislar,et al.  Content-Based Classification, Search, and Retrieval of Audio , 1996, IEEE Multim..