A speaker-independent speech-recognition system based on linear prediction

This paper describes a speaker-independent speech-recognition system that applies autoregression (linear prediction) to speech samples. Isolated words from a standard 40-word reading-test vocabulary are spoken by 25 different speakers. A reference pattern for each word is stored as coefficients of the Yule-Walker equations for 50 consecutive overlapping time windows. Several distance measures are proposed and evaluated in terms of recognition accuracy and computation speed. The best measure gives a recognition rate of 90.3 percent. Both the nearest-neighbor and K-nearest-neighbor algorithms are used in the decision scheme. Computation is minimized by making sequential decisions after a fixed number of iterations. This distance measure, coupled with a nonlinear time-warping function for matching windows, is observed to give the best results computationally. The number of speakers was then increased to 105 to show the statistical significance of the results. With the best procedure, the recognition rate for 105 speakers was 89.2 percent, and the recognition time was 9.8 seconds per utterance.
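
The pipeline summarized above (per-window linear prediction, a time-warped distance between window sequences, and a nearest-neighbor decision) can be sketched in a few lines of Python. This is a minimal illustration, not the system described in the paper. All function names are hypothetical. The Levinson-Durbin recursion is one standard way to solve the Yule-Walker equations and is assumed here, as is the interpretation of the stored pattern as predictor coefficients. The Euclidean distance between coefficient vectors is a stand-in, since the abstract does not specify the distance measures evaluated. The dynamic-programming alignment is a generic form of nonlinear time warping.

```python
import math


def autocorrelation(frame, order):
    """Return autocorrelation lags r[0..order] of one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations by the Levinson-Durbin recursion.

    Returns a[1..order] such that x[n] is approximated by
    sum over k of a[k] * x[n - k].
    """
    a = [0.0] * (order + 1)
    err = r[0]  # prediction error energy of the order-0 predictor
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:]


def frame_distance(u, v):
    """Euclidean distance between two coefficient vectors (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def dtw_distance(seq_a, seq_b):
    """Accumulated distance along the best nonlinear alignment of two
    sequences of coefficient vectors (basic dynamic time warping)."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat a frame of seq_b
                                 cost[i][j - 1],      # repeat a frame of seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def nearest_neighbor(query, references):
    """Return the label of the reference pattern closest to the query.

    references is a list of (label, sequence) pairs, so one word may have
    several reference patterns, for example one per training speaker.
    """
    label, _ = min(references, key=lambda ref: dtw_distance(query, ref[1]))
    return label
```

In use, each utterance would be cut into overlapping windows, each window passed through `autocorrelation` and `levinson_durbin`, and the resulting sequence of coefficient vectors compared against the stored patterns with `nearest_neighbor`. The predictor order, the window length, and the amount of overlap are left as free parameters because the abstract does not state them.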