An information filter for voice prompt suppression

Modern speech-enabled applications provide for dialog between a machine and one or more human users. The machine prompts the user with queries that are either prerecorded or synthesized on the fly. The human users respond with their own voices, and their speech is then recognized and understood by a human language understanding module. To make the interaction as natural as possible, the human users must be allowed to interrupt the machine during a voice prompt. In this work, we compare two techniques for such voice prompt suppression (VPS). The first is a straightforward adaptation of a conventional Kalman filter, which has certain advantages over the normalized least mean squares (NLMS) algorithm in terms of robustness and speed of convergence. The second algorithm, which is novel in this work, is also based on a Kalman filter, but differs from the first in that the update, or correction, step is performed in information space. This allows diagonal loading to be used to control the growth of the subband filter coefficients, and thereby adds robustness to the VPS.
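The abstract gives no equations, so the following is only a minimal sketch of what a correction step in information space with diagonal loading might look like for a single subband. It assumes a random-walk model for the filter coefficients, a scalar observation d = xᵀw + v, and loading added directly to the information matrix; the function names (`information_update`, `kalman_update`) and all parameters are hypothetical and not taken from the paper.

```python
import numpy as np

def information_update(w, P, x, d, r, q=0.0, loading=0.0):
    """One predict/correct cycle for one subband (illustrative assumption).

    w: (L,) complex filter estimate     P: (L, L) state covariance
    x: (L,) recent prompt (reference) subband samples
    d: complex microphone subband sample
    r: observation noise variance       q: process noise variance
    loading: diagonal loading added to the information matrix
    """
    L = w.shape[0]
    # Prediction in covariance space (random-walk coefficient model).
    P_prior = P + q * np.eye(L)
    # Convert to information space: Y = P^-1, y = Y w.
    Y = np.linalg.inv(P_prior)
    y = Y @ w
    # Correction: with H = x^T, add H^H H / r and H^H d / r.
    Y = Y + np.outer(x.conj(), x) / r
    y = y + x.conj() * d / r
    # Diagonal loading inflates Y only, which pulls w toward zero.
    Y = Y + loading * np.eye(L)
    P_new = np.linalg.inv(Y)
    return P_new @ y, P_new

def kalman_update(w, P, x, d, r, q=0.0):
    """Conventional covariance-form update for the same model."""
    L = w.shape[0]
    P_prior = P + q * np.eye(L)
    Px = P_prior @ x.conj()
    s = (x @ Px).real + r            # innovation variance
    K = Px / s                       # Kalman gain
    w_new = w + K * (d - x @ w)      # correct with the residual
    P_new = P_prior - np.outer(K, x) @ P_prior
    return w_new, P_new
```

With `loading=0.0` the two functions should agree up to rounding, by the matrix inversion lemma. A positive `loading` acts like a ridge penalty on the coefficients, which is one plausible reading of how loading could limit coefficient growth when the near-end talker interrupts the prompt.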
