Sparsity-based joint gaze correction and face beautification for conferencing video

A well-known problem in video conferencing is gaze mismatch. Instead of relying exclusively on online captured data for rendering, a recent work first trains offline dictionaries using a large image database of movie and TV stars to learn "beautiful" features. During real-time conferencing, one can then simultaneously correct gaze and beautify the subject's facial components in single images by seeking a sparse linear combination of pre-trained dictionary atoms for face reconstruction. Extending this work, we focus on joint gaze correction and face beautification for video. First, we define a large search space, invariant to scale, shift and rotation, for facial feature beautification based on SIFT. We then address two practical issues unique to video: i) how beautified results can be made temporally consistent across a group of pictures (GOP), and ii) how blinking eyes can be beautified even though the training database contains only open-eye facial images. Experimental results show that our method achieves the desired temporal consistency and that the rendered blinking process is smooth and natural.
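
To make the phrase "sparse linear combination of pre-trained dictionary atoms" concrete, the following is a minimal sketch of sparse coding with orthogonal matching pursuit (OMP). The abstract does not state which solver the method uses, so OMP is an illustrative stand-in, and the names `omp`, `D`, `y` and `k` are hypothetical. Here `D` holds one dictionary atom per column, `y` is a vectorized patch to reconstruct, and `k` caps the number of atoms used.

```python
import numpy as np

def omp(D, y, k):
    """Greedy sparse coding: approximate y using at most k columns (atoms) of D."""
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support = []                      # indices of the atoms selected so far
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:              # no new atom helps; stop early
            break
        support.append(j)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

# Illustrative use with a random, column-normalized dictionary (not real face data).
rng = np.random.default_rng(0)
D_demo = rng.standard_normal((16, 64))
D_demo /= np.linalg.norm(D_demo, axis=0)
y_demo = 2.0 * D_demo[:, 5] - 1.5 * D_demo[:, 40]
x_demo = omp(D_demo, y_demo, k=4)
print("atoms used:", np.count_nonzero(x_demo))
print("reconstruction error:", np.linalg.norm(y_demo - D_demo @ x_demo))
```

The reconstruction is `D @ coef`, so the output is built only from atoms learned offline; this is the sense in which features from the training database can carry over to the rendered face.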
