Articulatory Features and Associated Production Models in Statistical Speech Recognition

A statistical approach to speech recognition is outlined that draws a close parallel with closed-loop human speech communication, schematized as a joint process of encoding and decoding linguistic messages. The encoder consists of a symbolically valued, overlapping articulatory feature model and its interface to a nonlinear task-dynamic model of speech production. A general speech recognizer architecture, based on an optimal decoding strategy that incorporates encoder-decoder interactions, is described and discussed.
