A new ASR approach based on independent processing and recombination of partial frequency bands

We present a new approach to automatic speech recognition (ASR) within the framework of hidden Markov models (HMM) or hybrid HMM/artificial neural network (ANN) systems. The general idea is to split the full frequency band (represented in terms of critical bands) into a few sub-bands, apply a separate recognizer to each sub-band independently, and then recombine the recognizers' outputs at a chosen speech unit level to yield global scores and a global recognition decision. The preliminary results presented in this paper show that such an approach, even with quite simple recombination strategies, can yield at least comparable performance on clean speech while providing better robustness on noisy speech.
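
The recombination step can be illustrated with a minimal sketch. It assumes that each sub-band recognizer has already produced a log score for every candidate speech unit, and that the global score is a weighted sum of those log scores. This linear rule, the function names, the weights, and the example scores are all assumptions made for illustration; they are not taken from the paper, which does not specify its recombination strategies here.

```python
import math


def recombine_subband_scores(subband_log_scores, weights=None):
    """Combine per-sub-band log scores into one global score per candidate.

    subband_log_scores: list with one dict per sub-band, mapping a candidate
        speech unit to the log score given by that sub-band's recognizer.
    weights: optional list with one weight per sub-band (assumed linear
        recombination); defaults to equal weights.
    """
    n_bands = len(subband_log_scores)
    if weights is None:
        weights = [1.0 / n_bands] * n_bands
    candidates = subband_log_scores[0].keys()
    return {
        unit: sum(w * band[unit] for w, band in zip(weights, subband_log_scores))
        for unit in candidates
    }


def recognize(subband_log_scores, weights=None):
    """Return the candidate with the highest recombined global score."""
    global_scores = recombine_subband_scores(subband_log_scores, weights)
    return max(global_scores, key=global_scores.get)


# Hypothetical scores from three sub-band recognizers for two candidate units.
# The third sub-band is imagined to be corrupted by noise.
scores = [
    {"yes": math.log(0.7), "no": math.log(0.3)},
    {"yes": math.log(0.6), "no": math.log(0.4)},
    {"yes": math.log(0.2), "no": math.log(0.8)},
]

equal_weight_decision = recognize(scores)
downweighted_decision = recognize(scores, weights=[0.45, 0.45, 0.10])
```

In this toy example the corrupted sub-band decides the outcome under equal weights, while giving it a smaller weight lets the two cleaner sub-bands dominate. That is one way to picture why independent sub-band processing might help with band-limited noise.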
