Paper information - An investigation of PLP and IMELDA acoustic representations and of their potential for combination

Two acoustic representations that have both given good results in speech recognition tests are explored: the integrated Mel-scale representation with LDA (IMELDA) and perceptual linear prediction with root power sums (PLP-RPS). IMELDA is examined in the context of some related representations. Results of speaker-dependent and speaker-independent tests with digits and the alphabet suggest that the optimum PLP order is high, and that the effectiveness of PLP-RPS stems not from its modeling of perceptual properties but from its approximation to a desirable statistical property that IMELDA attains exactly. A combined PLP-IMELDA representation is found to be generally more effective than PLP-RPS, but an IMELDA representation derived directly from a filter-bank gives similar results to PLP-IMELDA at a lower computational cost.

M. J. Hunt | S. M. Richardson | D. C. Bateman | A. Piau
