A New Lip-Reading Approach for Human-Computer Interaction

Today, human-machine interaction offers real potential for autonomy, especially for dependent people. Automatic lip-reading is one of several assistive technologies for hearing-impaired or elderly people, and the need for such systems is steadily increasing. Reliable extraction and analysis of facial movements is an important part of many multimedia systems, such as videoconferencing, low communication systems, and lip-reading systems. We can imagine, for example, a dependent person commanding a machine with a simple lip movement or by simply pronouncing a viseme (visual phoneme). In this paper we present a new approach for lip localization and feature extraction in a speaker's face. The extracted visual information is then classified in order to recognize the uttered viseme. We have developed a prototype, Automatic Lip Feature Extraction (ALiFE), and evaluated it with multiple speakers under natural conditions. The experiments cover a group of French visemes uttered by different speakers. The results show that our system recognizes 92.50% of the French visemes.
