An efficient front-end for automatic speech recognition

This paper introduces an efficient speech front-end for automatic speech recognition. The front-end performs well in comparison with the traditional and widely used mel-frequency cepstral coefficients (MFCC), and it is also implemented efficiently on a low-resource system. Furthermore, because it allows near-perfect reconstruction of the speech signal, the front-end can be used directly for speech enhancement before recognition is carried out. Experimental results show that the new front-end achieves recognition results comparable or superior to those of MFCC in both clean and noisy conditions. Similar results were obtained in sub-band speech recognition experiments.
