Statistical estimators for use in automatic lip segmentation

The past decade has seen a considerable increase in interest in facial feature extraction. The primary reason is that facial features, the mouth region in particular, communicate important information about an individual that can be used in a wide array of applications. The shape and dynamics of the mouth region convey the content of a communicated message, which is useful in speech processing and in man-machine user interfaces. The mouth region can also be used as a parameter in a biometric verification system. Extracting the mouth region from a face often relies on lip contour processing to achieve these objectives, so reliably segmenting the lip region in an image of a talking face is a critical problem. This paper compares statistical estimators, both robust and non-robust, applied to automatic lip region segmentation. It then compares the results of the two resulting systems with a state-of-the-art method for lip segmentation.
