Paper information - Impulsive Environment Sound Detection by Neural Classification of Spectrogram and Mel-Frequency Coefficient Images


Automatic detection of impulsive sounds such as human sounds (screams, shouts), gun shots, machine-gun fire, thunder, fire alarms, and car horns is useful for people with hearing impairments. In this paper, instead of filtering the frequencies of each sound to identify its type, the frequency content of the sound is transformed into a recognizable image. The transformation is based on the audio spectrogram (power spectrum) and Mel-frequency cepstral coefficients (MFCC). The images of both the power spectrum and the Mel-frequency spectrum are used as inputs to an artificial neural network that recognizes the corresponding sound. The proposed technique is tested with six types of sound (machine gun, human scream, gun shot, thunder, fire alarm, and car horn) drawn from a sound database containing more than one hour of these impulsive sounds. The experimental results show that a spectrogram combined with a feed-forward neural network can detect the segments of impulsive sound in an audio signal with more than 94% accuracy.
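The spectrogram branch of this pipeline (audio signal, then power spectrum, then image) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame length, hop size, Hann window, sample rate, dB floor, and synthetic test signal are all assumptions chosen for illustration, and the MFCC branch and the neural network itself are omitted.

```python
import numpy as np

def power_spectrogram(signal, frame_len=256, hop=128):
    """Return the power spectrogram, shape (frame_len // 2 + 1, n_frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectrum = np.fft.rfft(frames, axis=1)
    return (np.abs(spectrum) ** 2).T

def to_image(power, floor_db=-80.0):
    """Map power values to a grayscale image in [0, 1] on a dB scale."""
    db = 10.0 * np.log10(power / power.max() + 1e-12)
    db = np.clip(db, floor_db, 0.0)
    return (db - floor_db) / (-floor_db)

# Synthetic test signal (assumption): 1 s of quiet noise with a short tone burst.
sample_rate = 8000
rng = np.random.default_rng(0)
signal = 0.01 * rng.standard_normal(sample_rate)
t = np.arange(400) / sample_rate
signal[4000:4400] += np.sin(2 * np.pi * 1000 * t)  # 1 kHz burst near 0.5 s

image = to_image(power_spectrogram(signal))
features = image.flatten()  # one possible input vector for a feed-forward classifier
```

In this sketch the burst appears as a bright region in the columns around the middle of the image, which is the kind of visual pattern a classifier could learn to separate from background noise. Flattening the image into a vector is only one way to present it to a feed-forward network; the abstract does not specify how the authors did this.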

Chidchanok Lursinsap | Thanapant Raicharoen | Peerapol Khunarsa
