Face reconstruction from monocular video using uncertainty analysis and a generic model

Reconstructing a 3D model of a human face from a monocular video sequence is an important problem in computer vision, with applications in recognition, surveillance, and multimedia. However, the quality of 3D reconstructions obtained with structure from motion (SfM) algorithms is often unsatisfactory, and one reason is the poor quality of the input video data. Hence, it is important that 3D face reconstruction algorithms take into account statistics representing the quality of the video. Also, because most faces are structurally similar, it is natural to expect that the performance of these algorithms can be improved by using a generic model of a face. Most existing work that uses a generic model initializes the reconstruction algorithm with it. The problem with this approach is that the algorithm can converge to a solution very close to the initial value, producing a reconstruction that resembles the generic model rather than the particular face in the video. In this paper, we propose a method of 3D reconstruction of a human face from video in which the 3D reconstruction algorithm and the generic model are handled separately. We show that it is possible to obtain a reasonably good 3D SfM estimate purely from the video sequence, provided the quality of the input video is statistically assessed and incorporated into the algorithm. The final 3D model is obtained by combining the SfM estimate and the generic model using an energy function that corrects errors in the estimate by comparing local regions of the two models. The main advantage of our algorithm over others is that it retains the specific features of the face in the video sequence even when they differ from those of the generic model, and it does so even as the quality of the input video varies. The evolution of the 3D model through the various stages of the algorithm and an analysis of its accuracy are presented.
