A Maximum Likelihood Approach to Continuous Speech Recognition

Speech recognition is formulated as a problem of maximum likelihood decoding. This formulation requires statistical models of the speech production process. In this paper, we describe a number of statistical models for use in speech recognition. We give special attention to determining the parameters for such models from sparse data. We also describe two decoding methods, one appropriate for constrained artificial languages and one appropriate for more realistic decoding tasks. To illustrate the usefulness of the methods described, we review a number of decoding results that have been obtained with them.
