Paper information - Compact Representation of Speech Using 2-D Cepstrum - An Application to Slovak Digits Recognition

An HMM speech recogniser that uses a small number of acoustic observations based on the 2-D cepstrum (TDC) is proposed. TDC implicitly represents both static and dynamic features of speech in matrix form. It is shown that TDC analysis enables a compact representation of speech signals. A great advantage of the proposed model is therefore a massive reduction in the number of speech features used for recognition, which lessens computational and memory requirements, so it may be favourable for limited-power ASR applications. Experiments on an isolated Slovak digits recognition task show that the method gives results comparable to those of the conventional MFCC approach. For speech degraded by additive white noise, it performs better than the MFCC method.
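To make the idea of a compact matrix representation concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a log-spectrogram is already available as a list of frames, and it uses an unnormalised DCT-II along both the frequency and time axes, which is one common simplification of the 2-D cepstrum; the abstract does not state the exact transform used. The names `dct2_1d`, `compact_tdc`, `n_quef` and `n_time`, and the toy input, are hypothetical.

```python
import math

def dct2_1d(x, n_keep):
    """First n_keep coefficients of the unnormalised DCT-II of sequence x."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
        for k in range(n_keep)
    ]

def compact_tdc(log_spec, n_quef, n_time):
    """Reduce a log-spectrogram (frames x frequency bins) to an
    n_time x n_quef matrix of low-order 2-D cepstral coefficients.
    Assumption: DCT on both axes stands in for the 2-D transform."""
    # Frequency axis -> quefrency: one short cepstral vector per frame.
    ceps = [dct2_1d(frame, n_quef) for frame in log_spec]
    # Time axis: transform the trajectory of each cepstral coefficient.
    columns = [dct2_1d([c[q] for c in ceps], n_time) for q in range(n_quef)]
    # Transpose so that rows index the time order and columns the quefrency.
    return [[columns[q][t] for q in range(n_quef)] for t in range(n_time)]

# Toy input (hypothetical data): 40 frames x 32 log-spectral bins.
frames, bins = 40, 32
log_spec = [
    [math.log(1.0 + (1 + math.sin(0.2 * t)) * (1 + math.cos(0.3 * f)))
     for f in range(bins)]
    for t in range(frames)
]

tdc = compact_tdc(log_spec, n_quef=8, n_time=4)
print(len(tdc), len(tdc[0]))  # 4 8
```

In this sketch, row 0 of the matrix sums each cepstral coefficient over time (a static, averaged description), while the higher rows capture how the coefficients vary across frames (dynamic information). The 40 x 32 input is reduced to 32 values, which illustrates the kind of feature reduction the abstract describes.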

Michal Kuba | Roman Jarina | Martin Paralic
