Pronouncing Rehabilitation of Hearing-Impaired Children Based on Chinese 3D Visual-Speech Database

Visual-speech concepts studied in the field of human-computer interaction are applied to a voice assistant system. A new method based on a parameter-driven 3D talking head is presented for rehabilitation training, aimed at the speech rehabilitation problems of hearing-impaired children. A 3D talking head model is established, and a 3D visual-speech database is built by extracting parameters of the face, tongue, and palate. Parameters extracted from the speech database drive the 3D talking head to provide pronunciation training for hearing-impaired children who are deaf but not mute. The extraction methods for the face and tongue parameters are given, and the specific implementation is described in detail. Experiments show that, by comparing the standard speech database parameters with the parameter sequences of a trainee's pronunciation, the method can locate where the two differ and give trainees intuitive feedback for improving the quality of their pronunciation.
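
The comparison step described in the abstract might look roughly like the following minimal sketch. It is an illustration only, not the authors' implementation: the parameter names (`jaw_open`, `lip_round`, `tongue_tip_height`), the `tolerance` threshold, and the assumption that the two sequences are already time-aligned frame by frame are all hypothetical and do not come from the source.

```python
from typing import Dict, List, Tuple

# One frame maps a parameter name to its value.
# The names used below are hypothetical; the source does not list them.
Frame = Dict[str, float]


def find_differences(
    reference: List[Frame],
    trainee: List[Frame],
    tolerance: float = 0.1,  # assumed threshold, not taken from the source
) -> List[Tuple[int, str, float]]:
    """Return (frame index, parameter name, signed deviation) for every
    parameter whose absolute deviation from the reference exceeds tolerance."""
    if len(reference) != len(trainee):
        raise ValueError("sequences must be time-aligned to the same length")
    differences = []
    for index, (ref_frame, trainee_frame) in enumerate(zip(reference, trainee)):
        for name, ref_value in ref_frame.items():
            deviation = trainee_frame[name] - ref_value
            if abs(deviation) > tolerance:
                differences.append((index, name, deviation))
    return differences


# Toy data: the trainee's tongue tip is too low in the second frame.
reference = [
    {"jaw_open": 0.2, "lip_round": 0.1, "tongue_tip_height": 0.5},
    {"jaw_open": 0.6, "lip_round": 0.1, "tongue_tip_height": 0.75},
]
trainee = [
    {"jaw_open": 0.2, "lip_round": 0.1, "tongue_tip_height": 0.5},
    {"jaw_open": 0.6, "lip_round": 0.1, "tongue_tip_height": 0.25},
]
diffs = find_differences(reference, trainee)
```

Each reported tuple names the frame and the articulator parameter that deviates, which is the kind of information a talking head could highlight as visual feedback. A real system would also need to time-align the trainee's utterance to the reference before comparing, which this sketch leaves out.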
