Speech enhancement using nullspace-based sound field control for barge-in free spoken dialogue interface

This paper describes a new small-scale interface for a barge-in free spoken dialogue system that combines multichannel sound field control with a microphone array, so that the system's response sound is canceled at the microphone positions. The conventional method restricts the user's movement because the user must stay at the fixed position where the response sound is reproduced. The proposed method, in contrast, places no control points for reproducing the response sound at the user's position, so the user is free to move. Furthermore, relaxing the requirement of strict reproduction of the response sound makes it possible to design a stable system with fewer loudspeakers than the conventional method requires. In speech recognition experiments, the proposed method achieves higher performance than the conventional method.
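The idea of canceling the response sound at the microphone points can be illustrated with a minimal single-frequency sketch. It is not the authors' implementation: the transfer matrix `G` (microphones by loudspeakers) is random rather than measured, and the names `nullspace_basis`, `G`, `B`, `c`, and `w` are hypothetical. The sketch assumes more loudspeakers than microphones, so that `G` has a nontrivial nullspace, and picks loudspeaker weights from that nullspace so their contributions sum to zero at every microphone.

```python
import numpy as np


def nullspace_basis(G, rtol=1e-10):
    """Return an orthonormal basis (as columns) for the nullspace of G.

    G is an (M, N) complex matrix of transfer functions from N loudspeakers
    to M microphones at one frequency (an assumed, illustrative setup).
    """
    G = np.atleast_2d(G)
    # Full SVD: the rows of Vh beyond the numerical rank span the nullspace.
    _, s, Vh = np.linalg.svd(G, full_matrices=True)
    rank = int(np.sum(s > rtol * s[0])) if s.size else 0
    return Vh[rank:].conj().T


# Hypothetical example: 2 microphones, 4 loudspeakers, random transfer matrix.
rng = np.random.default_rng(0)
M, N = 2, 4
G = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))

B = nullspace_basis(G)  # shape (N, N - rank)

# Any combination of the basis vectors is a valid set of loudspeaker weights.
c = rng.standard_normal(B.shape[1]) + 1j * rng.standard_normal(B.shape[1])
w = B @ c

# Sound pressure at the microphones produced by the weighted loudspeakers.
residual = np.linalg.norm(G @ w)
print("weights norm:", np.linalg.norm(w))
print("residual at microphones:", residual)
```

Because the weights are constrained only at the microphones, the sound field elsewhere in the room is left free, which is consistent with the abstract's point that no control points are placed at the user's position. A practical system would repeat such a design across frequencies and use measured room responses.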
