ALE for robots! A single-channel approach to robot self-noise cancellation

In this paper we investigate an Adaptive Line Enhancer (ALE) for the cancellation of robot self-noise. In contrast to many other methods, it requires only a single microphone and can be combined with any other single- or multi-channel noise reduction method. The proposed ALE is implemented in the frequency domain (FDALE) and cancels noise in both magnitude and phase. It therefore has the potential to reduce noise components without distorting the target signal. We combine the ALE with a traditional single-channel noise reduction filter: the ALE cancels the predictable noise components, and the filter suppresses the random noise components. We apply this approach to an automatic speech recognition task and show that significant improvements can be obtained.
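To illustrate the line-enhancer principle behind this approach, the following is a minimal time-domain sketch, not the frequency-domain FDALE proposed in the paper. It assumes an NLMS update, and the function name `ale_nlms` and the parameters `delay`, `taps`, and `mu` are hypothetical choices made for illustration only. The filter predicts the current sample from a delayed copy of the same microphone signal; periodic (predictable) noise is captured by the prediction, while the less predictable target remains in the residual.

```python
import numpy as np

def ale_nlms(x, delay=32, taps=64, mu=0.1, eps=1e-8):
    """Illustrative time-domain ALE with an NLMS update (assumed, not the paper's FDALE).

    Returns (predicted, residual): `predicted` estimates the predictable
    (periodic) component of x, `residual` is x with that estimate subtracted.
    """
    x = np.asarray(x, dtype=float)
    w = np.zeros(taps)                 # adaptive filter weights
    predicted = np.zeros_like(x)
    residual = x.copy()                # samples before start-up stay unprocessed
    for n in range(delay + taps - 1, len(x)):
        # The `taps` most recent samples of the delayed input, newest first.
        u = x[n - delay - taps + 1 : n - delay + 1][::-1]
        predicted[n] = w @ u
        residual[n] = x[n] - predicted[n]
        # Normalized LMS weight update driven by the prediction error.
        w += mu * residual[n] * u / (u @ u + eps)
    return predicted, residual

# Usage: a 200 Hz tone stands in for harmonic self-noise, and weak white
# noise stands in for the unpredictable part of the signal.
fs = 16000
t = np.arange(8000) / fs
rng = np.random.default_rng(0)
mixture = np.sin(2 * np.pi * 200 * t) + 0.1 * rng.standard_normal(t.size)
tone_estimate, enhanced = ale_nlms(mixture)
```

The delay is what separates the two components in a single channel: it should be long enough that the target is decorrelated from the delayed input, while the periodic noise remains correlated with it.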
