Robust Environmental Sound Recognition With Fast Noise Suppression for Home Automation

This paper proposes a robust environmental sound recognition system for home automation applications. The system comprises a fast subspace-based noise suppression module and a sound classification module. The noise suppression module applies fast subspace approximations in the wavelet domain, and we show that it has a lower computational cost than conventional methods. The sound classification module uses a feature extraction method that is also based on the wavelet subspace, with features derived from seventeen critical bands of the signal's wavelet packet transform. A multiclass support vector machine is then built using probability product kernels. Experimental results on ten classes of environmental sounds show that the proposed system performs robustly in environmental sound recognition tasks.
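To make the probability product kernel concrete, here is a minimal sketch of the kernel K(p, q) = ∫ p(x)^ρ q(x)^ρ dx for ρ = 1/2 (the Bhattacharyya kernel), evaluated in closed form between two diagonal Gaussians. The abstract does not state which density model or value of ρ the system uses, so the diagonal Gaussian model, the function names `fit_diag_gaussian` and `bhattacharyya_kernel`, and the toy feature values are assumptions made purely for illustration.

```python
import math


def fit_diag_gaussian(frames, var_floor=1e-6):
    """Fit a diagonal Gaussian (per-dimension mean and variance) to feature frames.

    Assumption: each sound clip is summarized by one Gaussian over its frames.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, var_floor)
        for d in range(dim)
    ]
    return means, variances


def bhattacharyya_kernel(mean_p, var_p, mean_q, var_q):
    """Probability product kernel with rho = 1/2 between two diagonal Gaussians.

    Per dimension, the closed form is
        sqrt(2*s_p*s_q / (v_p + v_q)) * exp(-(m_p - m_q)^2 / (4*(v_p + v_q))),
    and the dimensions multiply because the covariances are diagonal.
    """
    log_k = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        s = vp + vq
        log_k += 0.5 * math.log(2.0 * math.sqrt(vp * vq) / s)
        log_k -= (mp - mq) ** 2 / (4.0 * s)
    return math.exp(log_k)


# Toy usage: two hypothetical clips, each a list of 2-D feature frames.
clip_a = [[0.1, 1.0], [0.3, 1.2], [0.2, 0.8]]
clip_b = [[0.4, 0.9], [0.2, 1.1], [0.5, 0.7]]
p = fit_diag_gaussian(clip_a)
q = fit_diag_gaussian(clip_b)
k_ab = bhattacharyya_kernel(*p, *q)
print("K(a, b) =", k_ab)
```

A Gram matrix of such kernel values between all pairs of training clips could then be handed to any SVM implementation that accepts a precomputed kernel. With ρ = 1/2 the kernel of a distribution with itself is 1, so the values are already normalized.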
