Noise Detection and Classification in Speech Signals with Boosting

This paper presents a novel method for detecting and classifying sudden noises in speech signals. Sudden, short-duration noises are common in natural environments, such as the inside of a car. If a speech recognition system can detect such a noise, it can ask the speaker to repeat the utterance and so obtain clean speech data, which helps prevent system operation errors. In this paper, we detect and classify sudden noises in users' utterances using Boosting. Boosting can form a complex, non-linear decision boundary that determines whether the observed signal is speech, noise1, noise2, and so on. In our experiments, the proposed method performed well in comparison with a conventional method based on the Gaussian mixture model (GMM).
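To illustrate how Boosting can separate speech from several noise classes, the following is a minimal sketch and not the paper's implementation. It assumes AdaBoost with decision stumps as weak learners, a one-vs-rest scheme for the multi-class decision, and synthetic 2-D feature vectors standing in for real acoustic features. All function names, the class centers, and the parameter values are hypothetical.

```python
import numpy as np

def stump_predict(X, j, thr, pol):
    """Decision stump: +1 where pol * (X[:, j] - thr) > 0, otherwise -1."""
    return np.where(pol * (X[:, j] - thr) > 0, 1, -1)

def fit_stump(X, y, w):
    """Exhaustively search for the stump with the lowest weighted error."""
    best = (0, 0.0, 1, np.inf)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                err = w[stump_predict(X, j, thr, pol) != y].sum()
                if err < best[3]:
                    best = (j, thr, pol, err)
    return best

def fit_adaboost(X, y, rounds=20):
    """Binary AdaBoost with labels y in {-1, +1}."""
    w = np.full(len(y), 1.0 / len(y))
    model = []
    for _ in range(rounds):
        j, thr, pol, err = fit_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)   # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # weight of this weak learner
        # Increase the weight of misclassified frames, then renormalize.
        w = w * np.exp(-alpha * y * stump_predict(X, j, thr, pol))
        w /= w.sum()
        model.append((alpha, j, thr, pol))
    return model

def score(model, X):
    """Weighted vote of all stumps; the sign gives the binary decision."""
    return sum(a * stump_predict(X, j, thr, pol) for a, j, thr, pol in model)

def fit_one_vs_rest(X, labels, classes, rounds=20):
    """Train one binary booster per class (an assumed multi-class scheme)."""
    return {c: fit_adaboost(X, np.where(labels == c, 1, -1), rounds)
            for c in classes}

def classify(models, X):
    """Assign each frame to the class whose booster gives the highest score."""
    classes = list(models)
    scores = np.stack([score(models[c], X) for c in classes], axis=1)
    return np.array(classes)[scores.argmax(axis=1)]

# Hypothetical class centers in a toy 2-D feature space.
CENTERS = {"speech": (0.0, 0.0), "noise1": (4.0, 0.0), "noise2": (0.0, 4.0)}

def make_data(rng, n_per_class):
    """Draw Gaussian clusters around each hypothetical class center."""
    X = np.vstack([rng.normal(c, 0.5, size=(n_per_class, 2))
                   for c in CENTERS.values()])
    labels = np.repeat(list(CENTERS), n_per_class)
    return X, labels

rng = np.random.default_rng(0)
X_train, y_train = make_data(rng, 60)
X_test, y_test = make_data(rng, 40)

models = fit_one_vs_rest(X_train, y_train, list(CENTERS))
predicted = classify(models, X_test)
accuracy = float((predicted == y_test).mean())
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

In this toy setup the speech cluster cannot be isolated by any single threshold, so the speech-vs-rest booster has to combine several stumps, which shows on a small scale how Boosting builds a non-linear boundary from simple weak learners.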
