Audio scene semantic similarity computing approach

The audio track of a video carries abundant semantic information. An audio scene is a temporal audio segment that can be represented by a few basic audio effects. The semantic similarity between a pair of audio scenes is very useful for high-level audio semantic understanding. This paper proposes an approach for computing the semantic similarity of audio scenes. First, the audio track is pre-segmented into audio scenes. Then, the basic audio effects dominating each audio scene are recognized. Finally, the similarity of two audio scenes is calculated using a model consistent with information-theoretic similarity principles and Tversky's set-theoretic similarity. Experimental results indicate that the approach can quantify the semantic similarity of two audio scenes.
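The abstract does not give the similarity model itself, so the following is only a minimal sketch of how a Tversky-style ratio over sets of audio effects could be combined with an information-theoretic weighting. It assumes each scene is already reduced to a set of effect labels, and that the weight of a set is the sum of -log2 p(e) over its effects. The names `info` and `tversky_similarity`, the parameters `alpha` and `beta`, and the probability table `prob` are hypothetical and are not taken from the paper.

```python
import math


def info(effects, prob):
    """Information content of a set of effects: sum of -log2 p(e).

    Assumption: effects are treated as independent, so rarer effects
    contribute more to the weight of a set.
    """
    return sum(-math.log2(prob[e]) for e in effects)


def tversky_similarity(a, b, prob, alpha=0.5, beta=0.5):
    """Tversky ratio model with information content as the salience function.

    a, b  : iterables of effect labels dominating each scene (assumed input)
    prob  : dict mapping effect label -> prior probability (assumed input)
    alpha : weight of effects found only in scene a
    beta  : weight of effects found only in scene b
    """
    a, b = set(a), set(b)
    common = info(a & b, prob)   # shared effects
    only_a = info(a - b, prob)   # effects distinctive to a
    only_b = info(b - a, prob)   # effects distinctive to b
    denom = common + alpha * only_a + beta * only_b
    # Convention chosen for this sketch: no measurable information gives 0.
    return common / denom if denom > 0 else 0.0


# Illustrative priors and scenes (made-up values, not experimental data).
prob = {"speech": 0.5, "music": 0.25, "applause": 0.125, "laughter": 0.125}
scene_1 = {"speech", "applause"}
scene_2 = {"speech", "laughter"}

print(tversky_similarity(scene_1, scene_2, prob))
print(tversky_similarity(scene_1, scene_1, prob))
```

With `alpha = beta = 0.5` and an additive weight such as the one assumed here, the ratio reduces to twice the shared information divided by the total information of both scenes, which is the form of Lin's information-theoretic similarity. Unequal `alpha` and `beta` would make the measure asymmetric, as Tversky's model allows.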

[1]  Dekang Lin, et al. An Information-Theoretic Definition of Similarity, 1998, ICML.

[2]  A. Tversky. Features of Similarity, 1977.

[3]  Shih-Fu Chang, et al. Audio scene segmentation using multiple features, models and time scales, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[4]  E. Rosch. Cognitive Representations of Semantic Categories, 1975.

[5]  Lie Lu, et al. Highlight sound effects detection in audio stream, 2003, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698).

[6]  Thomas Sikora, et al. Speaker recognition using MPEG-7 descriptors, 2003, INTERSPEECH.

[7]  Lie Lu, et al. Unsupervised auditory scene categorization via key audio effects and information-theoretic co-clustering, 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.

[8]  Liu Feng-yu. Study of Basic Audio Semantic Analysis and Extraction Techniques for Video Data, 2007.

[9]  C. Krumhansl. Concerning the applicability of geometric models to similarity data: The interrelationship between similarity and spatial density, 1978.

[10]  C. Krumhansl. Concerning the Applicability of Geometric Models to Similarity Data: The Interrelationship Between Similarity and Spatial Density, 2005.

[11]  Shih-Fu Chang, et al. Overview of the MPEG-7 standard, 2001, IEEE Trans. Circuits Syst. Video Technol.