Lip detection and tracking

Seeing a talker's lips in addition to hearing the voice can improve speech understanding, which relies more on the temporal evolution of lip shape than on absolute mouth shape. We propose a fully automatic algorithm that extracts lip shape over an image sequence. The algorithm requires no make-up or markers and works under natural lighting conditions. Lip detection uses an active shape model to describe the mouth. After a training step, the mouth model is iteratively deformed under constraints according to spatiotemporal energies. The initial shape is difficult to position yet must be as accurate as possible; it is placed automatically from a robust prior detection of the mouth corners and Cupid's bow. Temporal information is integrated through Kalman filters defined on the independent mouth parameters. This filtering yields an initial shape close to the final one, which speeds up convergence. We also discuss the behaviour of the algorithm during transitions between open and closed mouth, in either direction.
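The temporal step can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each mouth parameter is tracked by its own scalar Kalman filter with a constant-velocity motion model; the class name `ParameterKalman`, the helper `predict_initial_shape`, and all noise values are hypothetical choices made for illustration. The prediction made before each frame is what would seed the shape model in that frame.

```python
class ParameterKalman:
    """Constant-velocity Kalman filter for ONE mouth parameter.

    Assumption: the abstract only states that Kalman filters are defined
    on independent mouth parameters; the motion model and the noise
    levels below are illustrative, not taken from the paper.
    """

    def __init__(self, q_pos=1e-3, q_vel=1e-3, r=1e-1, p0=1e3):
        self.x = 0.0          # parameter value estimate
        self.v = 0.0          # parameter velocity estimate (per frame)
        # Symmetric 2x2 covariance stored as three scalars.
        self.p00, self.p01, self.p11 = p0, 0.0, p0
        self.q_pos, self.q_vel, self.r = q_pos, q_vel, r

    def predict(self):
        """Advance one frame; return the predicted parameter value."""
        self.x += self.v
        # P <- F P F^T + Q with F = [[1, 1], [0, 1]]
        self.p00 += 2.0 * self.p01 + self.p11 + self.q_pos
        self.p01 += self.p11
        self.p11 += self.q_vel
        return self.x

    def update(self, z):
        """Correct the state with the value z measured after shape fitting."""
        s = self.p00 + self.r              # innovation variance
        k0, k1 = self.p00 / s, self.p01 / s  # Kalman gain
        y = z - self.x                     # innovation
        self.x += k0 * y
        self.v += k1 * y
        # P <- (I - K H) P with H = [1, 0]
        p00, p01, p11 = self.p00, self.p01, self.p11
        self.p00 = (1.0 - k0) * p00
        self.p01 = (1.0 - k0) * p01
        self.p11 = p11 - k1 * p01


def predict_initial_shape(filters):
    """Predict every parameter independently for the next frame."""
    return [f.predict() for f in filters]
```

In a tracking loop, `predict_initial_shape` would be called once per frame to initialise the model, the model would then be deformed on the image, and each filter's `update` would receive the fitted parameter value. Treating the parameters independently keeps each filter scalar and cheap, at the cost of ignoring any correlation between parameters.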
