In this paper we apply a model-based compensation method to cancel the effect of additive noise in Automatic Speech Recognition (ASR) systems. The method is formulated in a statistical framework so that the noise effect is compensated optimally given three elements: the observed noisy speech, a model describing the statistics of speech recorded in a clean reference environment, and an estimate of the noise in the noisy recognition environment. The noise is estimated from the first frames of the sentence to be recognized, and compensation is then performed frame by frame, so the procedure does not constrain real-time speech recognition systems and is compatible with emerging technologies based on distributed speech recognition. We have performed recognition experiments under noise conditions using the AURORA II database and the recognition tasks defined for it as a standard reference. Experiments have been carried out with both clean and multicondition training. The experimental results show improvements in recognition performance when the proposed model-based compensation method is applied.
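The two steps named above (noise estimation from the first frames, then frame-by-frame compensation against a clean-speech model) can be illustrated with a minimal sketch. It is not the method of this paper, whose details the abstract does not give. It assumes log-filterbank features, a diagonal-covariance Gaussian mixture as the clean-speech model, and the common log-domain mismatch approximation y = x + log(1 + exp(n - x)). The function names and the `n_init` parameter are hypothetical.

```python
import numpy as np

def estimate_noise(log_spec, n_init=10):
    """Average the first n_init frames, assumed to contain noise only."""
    return log_spec[:n_init].mean(axis=0)

def compensate_frame(y, noise, means, variances, weights):
    """MMSE-style clean-speech estimate for a single frame (illustrative).

    y         : (D,)   noisy log-filterbank frame
    noise     : (D,)   noise estimate in the same domain
    means     : (K, D) clean-speech GMM means (assumed model)
    variances : (K, D) diagonal variances, reused unchanged for noisy speech
    weights   : (K,)   mixture weights
    """
    # Shift of each Gaussian mean caused by the noise: log(1 + exp(n - mu)).
    shift = np.logaddexp(0.0, noise - means)
    noisy_means = means + shift
    # Log-likelihood of the frame under each noise-shifted Gaussian.
    log_lik = -0.5 * np.sum(
        (y - noisy_means) ** 2 / variances + np.log(2.0 * np.pi * variances),
        axis=1,
    )
    # Posterior probability of each Gaussian given the noisy frame.
    log_post = np.log(weights) + log_lik
    log_post -= np.logaddexp.reduce(log_post)
    post = np.exp(log_post)
    # Subtract the posterior-weighted shift from the noisy frame.
    return y - post @ shift

def compensate_utterance(log_spec, means, variances, weights, n_init=10):
    """Estimate the noise once, then compensate every frame independently."""
    noise = estimate_noise(log_spec, n_init)
    return np.array(
        [compensate_frame(y, noise, means, variances, weights) for y in log_spec]
    )
```

Because each frame needs only the noise estimate and the fixed model, the loop in `compensate_utterance` could run as frames arrive, which is the property the abstract relies on for real-time and distributed recognition.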