Paper information - Transform-based multi-feature optimization for robust distributed speech recognition

This paper describes a noise-robust Distributed Speech Recognition (DSR) front-end that combines conventional Mel-Frequency Cepstral Coefficients (MFCCs) with Line Spectral Frequencies (LSFs). These features are transformed and reduced in dimensionality within a multi-stream scheme using the Karhunen-Loève Transform (KLT). We evaluate the new DSR front-end in terms of recognition accuracy under adverse conditions and in terms of dimensionality reduction. Our results show that, for highly noisy speech, the proposed transformation scheme leads to a significant improvement in recognition accuracy on the Aurora 2 task.
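As a rough illustration of the KLT-based reduction described in the abstract, the following sketch stacks per-frame MFCC and LSF vectors and projects them onto the leading eigenvectors of their covariance matrix. It is a minimal sketch under stated assumptions, not the authors' implementation: the feature dimensions (13 and 10), the number of retained components (12), the random placeholder data, and the helper names `klt_fit` and `klt_transform` are all hypothetical, and the abstract does not say whether the transform is applied to the joint vector or to each stream separately.

```python
import numpy as np


def klt_fit(features, n_components):
    """Estimate a KLT basis from feature vectors stored as rows.

    Returns the mean vector and the eigenvectors of the sample covariance
    matrix with the largest eigenvalues, as columns.
    """
    mean = features.mean(axis=0)
    cov = np.cov(features - mean, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]


def klt_transform(features, mean, basis):
    """Project mean-centered features onto the KLT basis."""
    return (features - mean) @ basis


# Placeholder data (assumption): random values stand in for real features.
rng = np.random.default_rng(0)
n_frames = 200
mfcc = rng.normal(size=(n_frames, 13))  # hypothetical MFCC stream
lsf = rng.normal(size=(n_frames, 10))   # hypothetical LSF stream

# One possible reading of the scheme: concatenate the streams per frame,
# then keep the 12 directions of highest variance.
stacked = np.hstack([mfcc, lsf])
mean, basis = klt_fit(stacked, n_components=12)
reduced = klt_transform(stacked, mean, basis)
```

The projected components are mutually decorrelated and ordered by decreasing variance, which is the property that makes truncating the trailing components a natural way to reduce dimensionality.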

M. Boudraa | B. Boudraa | D. Addou | S. A. Selouani
