A multi-stream speech recognition system based on the estimation of stream weights

We propose a multi-stream framework for robust speech recognition based on the estimation of stream weights. First, two complementary acoustic features are selected: Mel-frequency cepstral coefficients (MFCCs) and linear prediction cepstral coefficients (LPCCs). Second, each feature is modeled separately with hidden Markov models (HMMs), and the two resulting models form the two streams of the system. Finally, the likelihood outputs of the two streams are combined by weighting, which yields better performance. We present a novel algorithm that computes the stream weights of the two feature streams from intra-class and inter-class distances. Experimental results on the Chinese Academy of Sciences speech database show that the system yields better recognition performance in all conditions. With this multi-stream framework, the word error rate decreased by 5%.
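The abstract does not give the weight formula, so the following is only a minimal sketch of how stream weights might be derived from intra- and inter-class distances and then used to combine per-stream HMM log-likelihoods. It assumes that each stream's weight is proportional to its ratio of mean inter-class distance to mean intra-class distance, that the two weights are normalized to sum to one, and that combination is a weighted sum of log-likelihoods. All function names (`separability`, `stream_weights`, `combine`) are hypothetical and not taken from the paper.

```python
import math


def class_means(features, labels):
    """Return the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def separability(features, labels):
    """Ratio of mean inter-class distance to mean intra-class distance.

    Assumption: Euclidean distance, at least two classes, and a non-zero
    intra-class spread. A larger value means the stream separates classes better.
    """
    means = class_means(features, labels)
    # Intra-class: average distance of each sample to its own class mean.
    intra = sum(math.dist(x, means[y]) for x, y in zip(features, labels)) / len(features)
    # Inter-class: average distance between every pair of class means.
    classes = sorted(means)
    pairs = [(a, b) for i, a in enumerate(classes) for b in classes[i + 1:]]
    inter = sum(math.dist(means[a], means[b]) for a, b in pairs) / len(pairs)
    return inter / intra


def stream_weights(sep_a, sep_b):
    """Normalize two separability scores into weights that sum to one (assumed rule)."""
    total = sep_a + sep_b
    return sep_a / total, sep_b / total


def combine(loglik_a, loglik_b, w_a, w_b):
    """Weighted sum of per-class log-likelihoods from the two streams.

    Returns the best-scoring class and the combined scores.
    """
    scores = {c: w_a * loglik_a[c] + w_b * loglik_b[c] for c in loglik_a}
    return max(scores, key=scores.get), scores
```

In use, `separability` would be evaluated once per stream on labeled training features (for example, the MFCC and LPCC vectors), the two scores passed to `stream_weights`, and the resulting weights applied in `combine` to the log-likelihoods each stream's HMMs assign to the candidate words.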
