Paper information - Unsupervised lip segmentation under natural conditions

Unsupervised lip segmentation under natural conditions

An unsupervised algorithm for segmenting a speaker's lips is presented. A color video sequence of the speaker's face is acquired under natural lighting conditions and without any particular make-up. First, a logarithmic color transform is performed from the RGB to the HI (hue, intensity) color space, and sequence-dependent parameters are evaluated. Second, a statistical approach using Markov random field modeling segments the mouth shape, using the region where red hue predominates and motion in a spatiotemporal neighborhood. Simultaneously, a region of interest (ROI) is automatically extracted. Third, the speaker's lip shape is extracted from the final hue field, with good-quality results in this challenging situation.
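The first step can be illustrated with a minimal sketch. The abstract does not state the transform's formulas, so everything here is an assumption made for illustration: hue is taken as the log-ratio of red to green, intensity as the log of the mean channel value, and the names `log_hue_intensity`, `red_hue_mask`, `eps`, and `hue_threshold` are hypothetical. The Markov random field and motion steps are not covered.

```python
import math


def log_hue_intensity(r, g, b, eps=1.0):
    """Map one 8-bit RGB pixel to a (hue, intensity) pair.

    Assumption: hue is log(R) - log(G) and intensity is the log of the
    mean channel value. These are illustrative, not the paper's formulas.
    eps avoids log(0) for black pixels.
    """
    hue = math.log(r + eps) - math.log(g + eps)
    intensity = math.log((r + g + b) / 3.0 + eps)
    return hue, intensity


def red_hue_mask(image, hue_threshold):
    """Mark pixels whose assumed hue exceeds a hypothetical threshold."""
    return [
        [log_hue_intensity(*pixel)[0] > hue_threshold for pixel in row]
        for row in image
    ]


# Tiny 2x2 test image: two reddish pixels, one skin-like, one gray.
image = [
    [(180, 90, 90), (200, 170, 150)],
    [(120, 120, 120), (170, 80, 85)],
]
mask = red_hue_mask(image, hue_threshold=0.5)
print(mask)
```

A fixed threshold such as this one stands in for the sequence-dependent parameters that the abstract says are evaluated from the video itself; in practice the raw mask would only be an initial labeling for the statistical segmentation.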

Franck Luthon | Marc Liévin

[1] Jean-Charles Pinoli, et al. Image dynamic range enhancement and stabilization in the context of the logarithmic image processing model, 1995, Signal Process.

[2] Franck Luthon, et al. Lip features automatic extraction, 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[3] Donald Geman, et al. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images, 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4] C. Benoît, et al. A set of French visemes for visual speech synthesis, 1994.

[5] David G. Stork, et al. Speechreading by Humans and Machines, 1996.

[6] S. Sridharan, et al. A syntactic approach to automatic lip feature extraction for speaker identification, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).