A new method is presented for adapting the HMMs of a speech recognition system to hands-free speech input in a room environment. Reverberation in a room usually degrades the performance of a recognition system. It artificially extends acoustic excitations, which becomes visible as a so-called reverberation tail in the envelope of the short-term energy, either over the whole frequency range or in subbands. The approach is based on the assumption that the acoustic excitation of a speech segment, as modeled by an HMM state, reappears in attenuated form at the successive HMM states. Adding these attenuated excitations in the spectral domain at each HMM state leads to a considerable improvement in recognition performance. Furthermore, a new approach is presented for adapting the Delta parameters that are usually taken as additional acoustic features. The effectiveness of both new techniques is demonstrated in experiments on isolated and connected word recognition with the TIDigits speech database.
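The core idea of adding attenuated excitations of earlier HMM states to later ones can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each state is described by a mean power spectrum in the linear domain, that each state has a known average duration, and that reverberant energy decays exponentially by 60 dB over the reverberation time T60. The names `attenuation`, `adapt_state_spectra`, `state_durations_s`, and `t60_s` are hypothetical and chosen for illustration only.

```python
from typing import List


def attenuation(delay_s: float, t60_s: float) -> float:
    """Power gain after delay_s seconds, assuming an exponential decay
    of 60 dB over t60_s seconds (an assumption of this sketch)."""
    return 10.0 ** (-6.0 * delay_s / t60_s)


def adapt_state_spectra(
    clean_spectra: List[List[float]],
    state_durations_s: List[float],
    t60_s: float,
) -> List[List[float]]:
    """Return adapted mean power spectra for a left-to-right HMM.

    For every state j, the clean spectra of all preceding states are
    added in the linear spectral domain, each attenuated according to
    the time elapsed between that earlier state and state j.
    """
    adapted = []
    for j, spectrum in enumerate(clean_spectra):
        total = list(spectrum)
        delay = 0.0
        # Walk backwards through the preceding states, accumulating
        # the average durations as the delay to the current state.
        for i in range(j - 1, -1, -1):
            delay += state_durations_s[i]
            gain = attenuation(delay, t60_s)
            total = [t + gain * c for t, c in zip(total, clean_spectra[i])]
        adapted.append(total)
    return adapted


if __name__ == "__main__":
    # Hypothetical two-band spectra for a three-state model.
    clean = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    durations = [0.1, 0.1, 0.1]  # assumed average state durations in seconds
    for state, spec in enumerate(adapt_state_spectra(clean, durations, t60_s=0.6)):
        print(state, spec)
```

The first state is left unchanged because nothing precedes it, while later states accumulate a decaying contribution from every earlier state. In a real system the adapted spectra would still have to be converted back to the recognizer's feature domain (for example cepstral coefficients), a step this sketch leaves out.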