Automatic combined lip segmentation in color images

Automatic speech recognition (ASR) performs well under restricted conditions. Lipreading is a core component of audio-visual speech recognition systems, and an accurate algorithm for lip detection and motion tracking helps improve the recognition rate. This paper proposes a combined lip segmentation method that pairs a newly developed method with the red exclusion algorithm. The accuracy of the proposed method is verified by applying it to several images.
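The abstract does not describe the new method, so only the red exclusion component can be illustrated here. The following is a minimal sketch, assuming the usual description of red exclusion: the red channel is ignored and a pixel is marked as a lip candidate when log(G/B) falls at or below a threshold. The function name `red_exclusion_mask`, the threshold `beta`, its default value, and the sample pixel values are all hypothetical and are not taken from the paper.

```python
import math


def red_exclusion_mask(pixels, beta=0.0):
    """Return a boolean mask that is True where log(G/B) <= beta.

    pixels: list of rows, each row a list of (r, g, b) integer tuples.
    beta:   hypothetical threshold; a real system would tune it per
            speaker and lighting condition.
    """
    mask = []
    for row in pixels:
        mask_row = []
        for _r, g, b in row:
            # The red channel is deliberately ignored ("red exclusion").
            # Adding 1 avoids log(0) and division by zero on dark pixels.
            ratio = math.log((g + 1) / (b + 1))
            mask_row.append(ratio <= beta)
        mask.append(mask_row)
    return mask


# Illustrative values only: one skin-like pixel and one lip-like pixel.
image = [[(200, 150, 120), (180, 80, 90)]]
mask = red_exclusion_mask(image)
```

In this toy input the skin-like pixel has more green than blue and is rejected, while the lip-like pixel has less green than blue and is kept. How the paper combines this mask with its new method is not stated in the abstract and is not modeled here.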
