Paper information - Design of a linguistic statistical decoder for the recognition of continuous speech

Most current attempts at automatic speech recognition are formulated in an artificial intelligence framework. In this paper we approach the problem from an information-theoretic point of view. We describe the overall structure of a linguistic statistical decoder (LSD) for the recognition of continuous speech. The input to the decoder is a string of phonetic symbols estimated by an acoustic processor (AP). For each phonetic string, the decoder finds the most likely input sentence. The decoder consists of four major subparts: 1) a statistical model of the language being recognized; 2) a phonemic dictionary and statistical phonological rules characterizing the speaker; 3) a phonetic matching algorithm that computes the similarity between phonetic strings, using the performance characteristics of the AP; 4) a word level search control. The details of each of the subparts and their interaction during the decoding process are discussed.
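The interaction of the four subparts can be illustrated with a small sketch. It is not the paper's implementation. It assumes a toy bigram language model, a four-word lexicon, and fixed match, substitution, insertion, and deletion probabilities for the AP, all of them hypothetical values chosen for illustration. The matcher scores only the single best alignment (a Viterbi-style simplification), and exhaustive enumeration of short word sequences stands in for the word-level search control.

```python
import math
from itertools import product

# Hypothetical toy data: none of these values come from the paper.
LEXICON = {  # phonemic dictionary: word -> phone sequence
    "the": ("DH", "AH"),
    "a": ("AH",),
    "cat": ("K", "AE", "T"),
    "cap": ("K", "AE", "P"),
}
BIGRAM = {  # language model: P(word | previous word), "<s>" marks the start
    ("<s>", "the"): 0.6, ("<s>", "a"): 0.4,
    ("the", "cat"): 0.7, ("the", "cap"): 0.3,
    ("a", "cat"): 0.5, ("a", "cap"): 0.5,
}
# Assumed AP performance characteristics (illustrative only).
P_MATCH, P_SUB, P_INS, P_DEL = 0.8, 0.1, 0.05, 0.05


def match_log_score(intended, observed):
    """Log score of the best alignment between intended and observed phones."""
    n, m = len(intended), len(observed)
    score = [[float("-inf")] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:  # intended phone deleted by the AP
                score[i][j] = max(score[i][j], score[i - 1][j] + math.log(P_DEL))
            if j > 0:  # spurious phone inserted by the AP
                score[i][j] = max(score[i][j], score[i][j - 1] + math.log(P_INS))
            if i > 0 and j > 0:  # phone matched or substituted
                p = P_MATCH if intended[i - 1] == observed[j - 1] else P_SUB
                score[i][j] = max(score[i][j], score[i - 1][j - 1] + math.log(p))
    return score[n][m]


def language_log_prob(words):
    """Bigram log probability of a word sequence, or -inf if unseen."""
    total, prev = 0.0, "<s>"
    for word in words:
        p = BIGRAM.get((prev, word), 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
        prev = word
    return total


def decode(observed, max_words=2):
    """Return the highest-scoring word sequence for an observed phone string."""
    best_words, best_score = None, float("-inf")
    for length in range(1, max_words + 1):
        for words in product(LEXICON, repeat=length):
            lm = language_log_prob(words)
            if lm == float("-inf"):
                continue
            phones = tuple(ph for w in words for ph in LEXICON[w])
            total = lm + match_log_score(phones, observed)
            if total > best_score:
                best_words, best_score = words, total
    return best_words, best_score


if __name__ == "__main__":
    print(decode(("DH", "AH", "K", "AE", "T")))
```

Combining the language-model score with the matching score in the log domain is one common reading of "most likely input sentence"; a practical decoder would replace the exhaustive loop with a guided search over word hypotheses.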

[1] R. Alter, et al. Utilization of contextual constraints in automatic speech recognition, 1968.

[2] F. Jelinek. Fast sequential decoding algorithm using a stack, 1969.

[3] L. Baum, et al. A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains, 1970.

[4] John Makhoul, et al. Organizing a System for Continuous Speech Understanding, 1973.

[5] Charles C. Tappert, et al. Application of sequential decoding for converting phonetic to graphic representation in automatic recognition of continuous speech (ARCS), 1973.

[6] Raj Reddy, et al. The HEARSAY Speech Understanding System, 1974.

[7] John Cocke, et al. Optimal decoding of linear codes for minimizing symbol error rate (Corresp.), 1974, IEEE Trans. Inf. Theory.

[8] Lalit R. Bahl, et al. Decoding for channels with insertions, deletions, and substitutions with applications to speech recognition, 1975, IEEE Trans. Inf. Theory.

[9] Donald E. Walker, et al. Speech Understanding Through Syntactic and Semantic Analysis, 1973, IEEE Transactions on Computers.