Acoustic-phonetic decoding of speech

Several methods for acoustic-phonetic decoding are reviewed. Emphasis is placed on the need for mathematical methods for speech recognition. Several examples of statistical methods are described. The author presents several techniques for incorporating “speech knowledge” into these statistical models, and provides a simple formalism for using multiple knowledge sources in a coherent speech recognition system.

[1]  John Makhoul,et al.  Context-dependent modeling for acoustic-phonetic recognition of continuous speech , 1985, ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[2]  Lalit R. Bahl,et al.  Decoding for channels with insertions, deletions, and substitutions with applications to speech recognition , 1975, IEEE Trans. Inf. Theory.

[3]  Stephen E. Levinson,et al.  Speaker independent isolated digit recognition using hidden Markov models , 1982, ICASSP.

[4]  S. Roucos,et al.  The role of word-dependent coarticulatory effects in a phoneme-based speech recognition system , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[5]  R. Schwartz,et al.  Rapid speaker adaptation using a probabilistic spectral mapping , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[6]  Richard M. Schwartz,et al.  Improved hidden Markov modeling of phonemes for continuous speech recognition , 1984, ICASSP.

[7]  Hermann Ney,et al.  Dynamic programming speech recognition using a context-free grammar , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[8]  Taras K. Vintsiuk Generative grammars and dynamic programming in speech recognition with learning , 1976, ICASSP.

[9]  John Makhoul,et al.  Continuous speech recognition results of the BYBLOS system on the DARPA 1000-word resource management database , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[10]  Richard M. Schwartz,et al.  A segment vocoder at 150 b/s , 1983, ICASSP.

[11]  Patti Price,et al.  The DARPA 1000-word resource management database for continuous speech recognition , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[12]  S. Roucos,et al.  Segment quantization for very-low-rate speech coding , 1982, ICASSP.

[13]  Lalit R. Bahl,et al.  A Maximum Likelihood Approach to Continuous Speech Recognition , 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  S. Roucos,et al.  A stochastic segment model for phoneme-based continuous speech recognition , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[15]  L. Baum,et al.  An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology , 1967 .