We propose and evaluate a novel method for improving lip contour tracking, based on active shape models (ASM), a class of statistical shape models, and on multiple features. On the first image of the video sequence, the lip region is detected using Bayes' rule, with lip color modeled by a Gaussian mixture model (GMM) trained with the expectation-maximization (EM) algorithm. The lip region is then used to initialize the lip shape model. A single-feature ASM performs well only under particular conditions and gets stuck in local minima under noisy conditions (beard, wrinkles, poor texture, low contrast between lip and skin, etc.). To improve convergence, we propose to use two features, the normal profile and grey-level patches, and to combine them using a voting approach. Because the standard ASM cannot take temporal information from previous frames into account, the lip contours are tracked by replacing it with a hybrid active shape model (HASM), which is capable of exploiting this temporal information. Initial experimental results on video sequences show that the resulting multi-feature HASM (MF-HASM) is more robust to the local minimum problem and achieves higher accuracy than the traditional single-feature method in lip tracking.
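The lip-region detection step can be illustrated with a minimal sketch: a GMM is fitted by EM to lip-color samples, and Bayes' rule turns the class likelihoods into a per-pixel lip posterior. Several things here are assumptions made for illustration and are not taken from the paper: the color feature is a single scalar per pixel, the competing class is a second GMM fitted to skin samples, the prior is 0.5, the training data are synthetic, and all function names are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(samples, k=2, iters=50):
    """Fit a 1-D GMM with k components by EM.

    Returns a list of (weight, mean, variance) tuples.
    """
    ordered = sorted(samples)
    n = len(ordered)
    mean_all = sum(ordered) / n
    var_all = sum((x - mean_all) ** 2 for x in ordered) / n + 1e-6
    # Deterministic initialisation: means at evenly spaced quantiles.
    comps = [(1.0 / k, ordered[(2 * j + 1) * n // (2 * k)], var_all) for j in range(k)]
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in samples:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in comps]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        updated = []
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            m = sum(r[j] * x for r, x in zip(resp, samples)) / nj
            v = sum(r[j] * (x - m) ** 2 for r, x in zip(resp, samples)) / nj + 1e-6
            updated.append((nj / n, m, v))
        comps = updated
    return comps


def gmm_likelihood(x, comps):
    """Mixture density p(x | class) for a fitted GMM."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in comps)


def lip_posterior(x, lip_gmm, skin_gmm, prior_lip=0.5):
    """Bayes' rule: P(lip | x) from the two class likelihoods and a prior."""
    p_lip = gmm_likelihood(x, lip_gmm) * prior_lip
    p_skin = gmm_likelihood(x, skin_gmm) * (1.0 - prior_lip)
    denom = p_lip + p_skin
    return p_lip / denom if denom > 0 else 0.0


# Synthetic training data (assumed): scalar color feature in [0, 1].
random.seed(0)
lip_samples = [random.gauss(0.70, 0.03) for _ in range(200)] + \
              [random.gauss(0.80, 0.03) for _ in range(200)]
skin_samples = [random.gauss(0.30, 0.04) for _ in range(200)] + \
               [random.gauss(0.42, 0.04) for _ in range(200)]

lip_gmm = fit_gmm_em(lip_samples)
skin_gmm = fit_gmm_em(skin_samples)

# A pixel is labelled "lip" when its posterior exceeds 0.5.
for value in (0.35, 0.75):
    print(value, lip_posterior(value, lip_gmm, skin_gmm) > 0.5)
```

In a real pipeline the feature would more plausibly be a two- or three-dimensional color vector, and the thresholded posterior map would be cleaned up before it is used to place the initial shape model.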