Estimation of source location based on 2-D MUSIC and its application to speech recognition in cars

This paper proposes a speech recognition and enhancement system for noisy car environments based on a microphone array. In the system, multiple microphones are arranged in 2-dimensional space, surrounding the interior of a car, and the speaker's location is first estimated by our proposed HE (harmonic enhanced) 2-D MUSIC (MUltiple SIgnal Classification). Then, 2-D delay and sum (DS) is applied to enhance the target speech. Such pre-processing makes robust speech recognition in noisy car environments possible. A further advantage of the proposed system is that not only the driver but also a passenger can control car electronics by voice, wherever they are seated. Results of a simulation and a preliminary experiment in a real car environment are presented to confirm the validity of the proposed system.
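The two-stage pipeline described above (locate the speaker on a 2-D grid, then steer a delay-and-sum beam at the estimate) can be illustrated with a minimal narrowband sketch. This is not the HE 2-D MUSIC of the paper: it uses plain single-frequency MUSIC with near-field (spherical-wave) steering vectors and ignores amplitude attenuation and reverberation. The microphone layout, the 1.5 m by 1.2 m interior, the 1 kHz analysis frequency, the noise level, and all function names are assumptions made for illustration only.

```python
import numpy as np

C = 343.0      # speed of sound in air, m/s
FREQ = 1000.0  # single analysis frequency in Hz (assumed for this sketch)

def steering_vector(mics, point, freq=FREQ, c=C):
    """Phase of a spherical wave from `point` at each microphone (near field)."""
    dists = np.linalg.norm(mics - point, axis=1)
    return np.exp(-2j * np.pi * freq * dists / c)

def music_2d(snapshots, mics, grid, n_sources=1):
    """MUSIC pseudo-spectrum over candidate 2-D positions in `grid`.

    snapshots: complex array of shape (n_mics, n_snapshots) at one frequency.
    """
    R = snapshots @ snapshots.conj().T / snapshots.shape[1]  # spatial covariance
    _, vecs = np.linalg.eigh(R)                               # eigenvalues ascending
    noise = vecs[:, : mics.shape[0] - n_sources]              # noise subspace
    spectrum = np.empty(len(grid))
    for i, p in enumerate(grid):
        a = steering_vector(mics, p)
        # Peak where the steering vector is orthogonal to the noise subspace.
        spectrum[i] = 1.0 / (np.linalg.norm(noise.conj().T @ a) ** 2 + 1e-12)
    return spectrum

def delay_and_sum(snapshots, mics, point):
    """Phase-align the channels toward `point` and average them."""
    a = steering_vector(mics, point)
    return a.conj() @ snapshots / mics.shape[0]

# Hypothetical layout: 8 microphones on the perimeter of a 1.5 m x 1.2 m interior.
mics = np.array([[0.0, 0.0], [0.75, 0.0], [1.5, 0.0], [1.5, 0.6],
                 [1.5, 1.2], [0.75, 1.2], [0.0, 1.2], [0.0, 0.6]])
xs = np.linspace(0.1, 1.4, 14)
ys = np.linspace(0.1, 1.1, 11)
grid = np.array([[x, y] for x in xs for y in ys])
true_pos = grid[40]  # the simulated speaker sits on a grid point

# Simulate one narrowband source plus independent sensor noise.
rng = np.random.default_rng(0)
n_snap = 200
s = (rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)) / np.sqrt(2)
noise = 0.1 * (rng.standard_normal((8, n_snap))
               + 1j * rng.standard_normal((8, n_snap))) / np.sqrt(2)
x = np.outer(steering_vector(mics, true_pos), s) + noise

spectrum = music_2d(x, mics, grid)
est_pos = grid[np.argmax(spectrum)]        # stage 1: location estimate
enhanced = delay_and_sum(x, mics, est_pos)  # stage 2: enhancement
```

A real speech signal is wideband, so a practical system would combine evidence across many frequency bins; the single-frequency version is kept here only to make the subspace search and the beam steering easy to follow.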
