On the effects of varying analysis parameters on an LPC-based isolated word recognizer

For practical hardware implementations of isolated-word recognition systems, it is important to understand how the feature set chosen for recognition affects the overall performance of the recognizer. In particular, we would like to determine whether hardware implementations could be simplified by reducing computation and memory requirements without significantly degrading overall system performance. The effects of system bandwidth (in both training and testing the recognizer) on performance must also be considered, since the conditions under which the system is used may differ from those under which it was trained. Finally, for the system to operate properly, we must take into account the effects of finite word-length implementations on the computation of both features and distances. In this paper we present the results of a study to determine the effects on recognition error rate of varying the basic analysis parameters of a linear predictive coding (LPC) model of speech. The results showed that system performance was best with an analysis parameter set equivalent to the one currently used in the computer simulations; variations in parameter values that reduced computation also degraded performance, whereas variations that increased computation did not improve performance.
