Extraction automatique de contour de lèvre à partir du modèle CLNF (Automatic lip contour extraction using the CLNF model) [In French]

In this paper we propose a new approach for extracting the inner lip contour of a speaker without using artificial aids. The method builds on a recent face contour extraction algorithm developed in computer vision, the Constrained Local Neural Field (CLNF), which provides 8 characteristic points (landmarks) delimiting the inner contour of the lips. Applied directly to our audiovisual data of the speaker, CLNF gives very satisfactory results in about 70% of cases; errors remain in the other cases. We propose solutions for estimating a reasonable inner lip contour from the landmarks provided by CLNF, based on spline interpolation, in order to correct these errors and to extract the classical labial parameters A, B and S. Evaluations on a database of 179 images confirm the performance of our algorithm.

KEYWORDS: CLNF model, spline, lip contour, labial parameters, visual speech.
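
To make the spline and parameter-extraction step concrete, here is a minimal sketch of fitting a closed cubic spline through the 8 CLNF inner-lip landmarks and deriving A, B and S. Python with SciPy is an assumption (the paper does not specify an implementation), the landmark coordinates are invented for illustration, and A, B and S are taken here as inner lip width, aperture and area, which is the usual convention rather than necessarily the paper's exact definitions.

```python
import numpy as np
from scipy.interpolate import splprep, splev

# Hypothetical 8 inner-lip landmarks (x, y) in pixel coordinates,
# listed in order around the contour, as a CLNF tracker would return them.
landmarks = [(120, 210), (135, 200), (150, 196), (165, 200),
             (180, 210), (165, 222), (150, 226), (135, 222)]

pts = np.asarray(landmarks, dtype=float)
x, y = pts[:, 0], pts[:, 1]

# Closed (periodic) parametric cubic spline through the landmarks;
# the first point is repeated so the curve closes on itself.
tck, _ = splprep([np.r_[x, x[0]], np.r_[y, y[0]]], s=0, per=True)
cx, cy = splev(np.linspace(0.0, 1.0, 200), tck)

A = cx.max() - cx.min()   # inner lip width
B = cy.max() - cy.min()   # inner lip aperture (height)
# Enclosed inner-lip area via the shoelace formula.
S = 0.5 * abs(np.dot(cx, np.roll(cy, -1)) - np.dot(cy, np.roll(cx, -1)))

print(f"A = {A:.1f} px, B = {B:.1f} px, S = {S:.1f} px^2")
```

In this sketch the dense spline samples, rather than the 8 raw landmarks, are used to measure width, aperture and area, which is the point of interpolating a smooth contour before extracting the labial parameters.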
