Impulse response estimation for robust speech recognition in a reverberant environment

This paper addresses a voice-enabled smart-home scenario in which contaminated speech is generated to train a distant-speech recognition system. The impulse response measurement process is investigated, with a specific focus on its impact on speech recognition performance. Experimental results on a phone-loop task and a word-loop task show that recognition performance changes significantly depending on the technique used for impulse response estimation. In particular, the best performance is obtained when an exponential sine sweep is used as the excitation sequence, with a proper choice of its length and of the energy with which it is propagated in the environment.
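
As an illustration of the kind of measurement discussed above, the following is a minimal sketch of impulse response estimation with an exponential sine sweep, using the commonly used formulation in which the recording is convolved with a time-reversed, amplitude-compensated copy of the sweep. It is not the procedure used in the paper: the function names, the sampling rate, the frequency range, the sweep duration, and the toy two-tap "room" are all assumptions chosen for the example, and the random signal only stands in for clean speech.

```python
import numpy as np


def exponential_sine_sweep(f1, f2, duration, fs):
    """Return an exponential sine sweep from f1 to f2 Hz and its inverse filter."""
    n = int(round(duration * fs))
    t = np.arange(n) / fs
    rate = np.log(f2 / f1)  # natural log of the frequency ratio
    sweep = np.sin(2.0 * np.pi * f1 * duration / rate
                   * (np.exp(t * rate / duration) - 1.0))
    # Inverse filter: time-reversed sweep with a decaying envelope that
    # compensates the sweep's falling (pink-like) magnitude spectrum.
    inverse = sweep[::-1] * np.exp(-t * rate / duration)
    return sweep, inverse


def fft_convolve(a, b):
    """Linear convolution of two 1-D signals computed via the FFT."""
    n = len(a) + len(b) - 1
    nfft = 1 << (n - 1).bit_length()  # next power of two >= n
    spectrum = np.fft.rfft(a, nfft) * np.fft.rfft(b, nfft)
    return np.fft.irfft(spectrum, nfft)[:n]


def estimate_impulse_response(recorded, sweep, inverse, ir_length):
    """Deconvolve the recorded sweep and keep the first ir_length samples."""
    start = len(inverse) - 1  # output index that corresponds to zero delay
    # Normalise so that a pure delay yields a peak close to 1.
    scale = fft_convolve(sweep, inverse)[start]
    full = fft_convolve(recorded, inverse)
    return full[start:start + ir_length] / scale


# Assumed example settings (not taken from the paper).
fs = 16000
sweep, inverse = exponential_sine_sweep(f1=100.0, f2=7000.0, duration=1.0, fs=fs)

# Toy "room": a direct path at 40 samples and one reflection at 200 samples.
true_ir = np.zeros(256)
true_ir[40] = 1.0
true_ir[200] = 0.5

# Simulated microphone recording of the sweep played in the toy room.
recorded = fft_convolve(sweep, true_ir)
est = estimate_impulse_response(recorded, sweep, inverse, ir_length=256)

# Contaminated speech: a clean signal convolved with the estimated response.
# The random signal below is only a placeholder for real clean speech.
clean = np.random.default_rng(0).standard_normal(fs)
contaminated = fft_convolve(clean, est)
```

In this sketch the estimate is a band-limited version of the toy response, so its two taps should appear near the simulated delays with some ripple around them. The `duration` argument corresponds to the sweep length mentioned above; in a real measurement the playback level and background noise would also affect the quality of the estimate, which this noise-free simulation does not model.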