Noise-robust environmental sound classification method based on a combination of ICA and MP features

This paper presents an environmental sound classification method that is robust to noise in sounds recorded by mobile devices, and evaluates its performance. The method is specifically designed to recognize higher-level context semantics from environmental sound. Conventionally, sound classification has used frequency-domain acoustic features extracted from sound data with signal processing techniques. The most popular such feature is the Mel-frequency cepstral coefficient (MFCC), but MFCC is poorly suited to sound mixed with noise. Independent Component Analysis (ICA) can extract sound characteristics even when the source is corrupted by noise, because the components within the source are assumed to be independent. In recent years, Matching Pursuit (MP) has been used to extract time-domain features and has been applied in a variety of applications. MP features are effective for recognizing and classifying environmental sounds that include time-variant sounds such as birdsong, alarms, and vehicle sounds. Several innovative techniques have thus been proposed for recognizing and classifying environmental sounds recorded on mobile devices. However, no decisive method yet attains a high recognition and classification rate for environmental sounds that contain various kinds of noise, such as unintended sounds and white noise. To address this problem, we propose a noise-robust classification method that combines ICA and MP features. The combination makes it possible to reduce the effect of noise on feature extraction. In performance evaluations, we confirmed that the proposed method classifies about 8% better than MFCC feature extraction.
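As an illustration of the MP side of the method only, the following is a minimal sketch of greedy Matching Pursuit over a small Gabor dictionary. The abstract does not specify the dictionary, the number of iterations, or the feature definition, so the parameter grid, the function names (`gabor_atom`, `build_dictionary`, `matching_pursuit`, `mp_features`), and the choice to summarize the selected atoms by the mean and standard deviation of their scale and frequency are all assumptions made for this example. The ICA step and the classifier are not shown.

```python
import math

def gabor_atom(n, scale, freq, shift):
    """Unit-norm Gabor atom: a Gaussian window times a cosine."""
    g = [math.exp(-math.pi * ((t - shift) / scale) ** 2)
         * math.cos(2 * math.pi * freq * (t - shift)) for t in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]

def build_dictionary(n, scales=(4, 8, 16), freqs=(0.05, 0.1, 0.2), step=8):
    """Return (params, atoms) over a small, hypothetical parameter grid."""
    params = [(s, f, u) for s in scales for f in freqs for u in range(0, n, step)]
    atoms = [gabor_atom(n, s, f, u) for s, f, u in params]
    return params, atoms

def matching_pursuit(signal, atoms, n_iter=5):
    """Greedy MP: repeatedly subtract the atom most correlated with the residual."""
    residual = list(signal)
    picked = []
    for _ in range(n_iter):
        corrs = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(corrs[i]))
        residual = [r - corrs[k] * a for r, a in zip(residual, atoms[k])]
        picked.append(k)
    return picked, residual

def mp_features(signal, params, atoms, n_iter=5):
    """Assumed feature: mean and std of the selected atoms' scale and frequency."""
    picked, _ = matching_pursuit(signal, atoms, n_iter)
    feats = []
    for col in (0, 1):  # 0: scale, 1: frequency
        vals = [params[k][col] for k in picked]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        feats += [mean, math.sqrt(var)]
    return feats

# Example: a short synthetic tone standing in for one audio frame.
n = 64
params, atoms = build_dictionary(n)
frame = [math.sin(2 * math.pi * 0.1 * t) for t in range(n)]
features = mp_features(frame, params, atoms)  # [scale mean, scale std, freq mean, freq std]
```

Because every atom has unit norm, each iteration removes the squared correlation of the chosen atom from the residual energy, so the residual never grows. In a combined scheme such a vector would be concatenated with ICA-derived features before classification; how the paper performs that combination is not stated in the abstract.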
