Automatic Recognition of Urban Soundscenes

In this paper we propose a novel architecture for environmental sound classification. We first review current work in this research field. We then explore the use of Mel-frequency cepstral coefficients (MFCCs) and MPEG-7 audio features in combination with a classification method based on Gaussian mixture models (GMMs). We provide details on both the feature extraction process and the recognition stage of the proposed methodology. The performance of this implementation is evaluated through experiments on six categories of environmental sounds (aircraft, motorcycle, car, crowd, thunder, train). The proposed method is fast and does not require high computational resources, so it meets the needs of a real-time application.
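The abstract does not state the model configuration, so the following is only a minimal sketch of the general recognition scheme it describes: one GMM is trained per sound category on frame-level feature vectors, and a clip is assigned to the category whose model gives the highest average log-likelihood. Everything specific here is an assumption made for illustration: the synthetic 4-dimensional vectors stand in for real MFCC or MPEG-7 features, and the diagonal covariances, the two mixture components, the iteration count, and all function names are hypothetical choices, not the authors' settings.

```python
import math
import random


def component_log_probs(x, weights, means, variances):
    """Log of (weight * diagonal Gaussian density) for each mixture component."""
    out = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xd, md, vd in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * vd) + (xd - md) ** 2 / vd)
        out.append(ll)
    return out


def fit_diag_gmm(frames, n_components=2, n_iter=20, seed=0):
    """Fit a diagonal-covariance GMM to a list of feature vectors with EM."""
    rng = random.Random(seed)
    n, dim = len(frames), len(frames[0])
    # Initialise means from random frames and variances from the global variance.
    means = [list(f) for f in rng.sample(frames, n_components)]
    gmean = [sum(f[d] for f in frames) / n for d in range(dim)]
    gvar = [sum((f[d] - gmean[d]) ** 2 for f in frames) / n + 1e-3 for d in range(dim)]
    variances = [list(gvar) for _ in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each frame.
        resp = []
        for f in frames:
            logs = component_log_probs(f, weights, means, variances)
            m = max(logs)
            ps = [math.exp(l - m) for l in logs]
            s = sum(ps)
            resp.append([p / s for p in ps])
        # M-step: re-estimate weights, means and (floored) variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-10
            weights[k] = nk / n
            for d in range(dim):
                mu = sum(r[k] * f[d] for r, f in zip(resp, frames)) / nk
                var = sum(r[k] * (f[d] - mu) ** 2 for r, f in zip(resp, frames)) / nk
                means[k][d] = mu
                variances[k][d] = var + 1e-3
    return weights, means, variances


def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of a clip under one GMM."""
    total = 0.0
    for f in frames:
        logs = component_log_probs(f, *model)
        m = max(logs)
        total += m + math.log(sum(math.exp(l - m) for l in logs))
    return total / len(frames)


def classify(frames, models):
    """Return the label whose GMM scores the clip highest."""
    return max(models, key=lambda label: avg_log_likelihood(frames, models[label]))


def make_frames(center, n_frames, rng, dim=4):
    """Synthetic stand-in for extracted features (NOT real MFCC/MPEG-7 data)."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n_frames)]


# Hypothetical, well-separated cluster centres for three of the six categories.
CENTERS = {"car": 0.0, "train": 5.0, "thunder": -5.0}

train_rng = random.Random(42)
models = {
    label: fit_diag_gmm(make_frames(center, 200, train_rng))
    for label, center in CENTERS.items()
}

clip = make_frames(CENTERS["train"], 50, random.Random(7))
print(classify(clip, models))
```

Scoring a clip costs one pass over its frames per category model, which is consistent with the low computational cost claimed above. In a real system the synthetic frames would be replaced by features extracted from audio.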
