Paper information - Optimization of perceptually-based ASR front-end (automatic speech recognition)

Several recently proposed automatic speech recognition (ASR) front-ends are experimentally compared for speaker-dependent and cross-speaker ASR. The perceptually based linear predictive front-end yields the highest accuracies. By modifying its sensitivity to spectral peaks and to spectral tilt, and by utilizing speech dynamics, the authors further improve its error rate in speaker-independent ASR by about 10%.

H. Hermansky | J. C. Junqua
