An Unsupervised Algorithm to Extract Face Texture from Video

Building face models is an essential task in face recognition, tracking, and related applications. However, most current techniques require hand-labelling or special machinery, such as a cyber-scanner, to extract the face model. In this paper, we propose an unsupervised algorithm to learn the face texture from video. The proposed approach models the video sequence as a mixture of dynamic face layers and background layers, where the dynamic face layers may undergo 3D motion in the video. The hidden variables and their generating process are represented by a probabilistic graphical model, which is learnt by the EM algorithm with a variational approximation. The proposed approach offers several advantages over existing algorithms. First, it derives its learning power from a generative model that naturally represents the generating process of videos. Second, it does not require any labelling or face detection algorithm. Third, its application domain is not restricted to extracting face texture; it can be adapted to model other objects as well. The experimental results demonstrate that our model is capable of learning the appearance model of faces with complex 3D motions in the video.
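As a rough illustration of the layered-mixture idea, the sketch below fits a much-simplified two-layer model with EM. It is not the algorithm proposed in the paper. It assumes static layers (no 3D motion or transformation variables), a fixed noise variance shared by both layers, and independent per-pixel layer assignments, so the posterior is exact and no variational approximation is needed. The function name `em_two_layer` and all of its parameters are hypothetical, chosen only for this example.

```python
import numpy as np


def em_two_layer(frames, n_iter=20, var=0.01, seed=0):
    """Fit a per-pixel two-layer Gaussian mixture to a video with EM.

    frames: (T, P) array, T flattened grayscale frames of P pixels each.
    Returns the two layer appearances, the per-pixel prior of the first
    layer, and the log-likelihood recorded at each iteration.
    Simplifying assumptions (not from the paper): no motion, fixed `var`.
    """
    frames = np.asarray(frames, dtype=float)
    rng = np.random.default_rng(seed)
    T, P = frames.shape

    # Initialise one layer from a random frame, the other from the mean frame.
    mu_fg = frames[rng.integers(T)].copy()
    mu_bg = frames.mean(axis=0)
    pi = np.full(P, 0.5)  # prior probability that a pixel shows layer "fg"
    log_liks = []

    for _ in range(n_iter):
        # E-step: posterior responsibility of the "fg" layer for every
        # pixel in every frame, computed in the log domain for stability.
        log_fg = np.log(pi) - 0.5 * (frames - mu_fg) ** 2 / var
        log_bg = np.log1p(-pi) - 0.5 * (frames - mu_bg) ** 2 / var
        log_norm = np.logaddexp(log_fg, log_bg)
        resp = np.exp(log_fg - log_norm)
        log_liks.append(
            log_norm.sum() - 0.5 * T * P * np.log(2.0 * np.pi * var)
        )

        # M-step: responsibility-weighted averages give the new layer
        # appearances; the mean responsibility gives the new prior.
        w_fg = resp.sum(axis=0)
        w_bg = T - w_fg
        mu_fg = (resp * frames).sum(axis=0) / np.maximum(w_fg, 1e-12)
        mu_bg = ((1.0 - resp) * frames).sum(axis=0) / np.maximum(w_bg, 1e-12)
        pi = np.clip(w_fg / T, 1e-6, 1.0 - 1e-6)

    return mu_fg, mu_bg, pi, log_liks
```

Because the two layers are symmetric in this toy model, which one ends up representing the moving object depends on the initialisation. Handling motion, as the abstract describes, would require adding transformation variables, which is where an approximate (for example variational) E-step becomes necessary.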
