Face localization via hierarchical CONDENSATION with Fisher boosting feature selection

We formulate face localization as a maximum a posteriori (MAP) problem: finding the best estimate of the human face configuration in a given image. The prior distribution over the intrinsic face configuration is defined by an active shape model (ASM). The likelihood model for local facial features is parameterized as a mixture of Gaussians in feature space. A hierarchical CONDENSATION framework is then proposed to estimate the face configuration parameters. To improve the discriminative power of the likelihood distribution in feature space, a new feature subspace, the Fisher boosting feature space, is proposed and compared against the PCA subspace and the biased PCA subspace. Experiments show three things. First, as illustrated on a toy problem, the Fisher boosting algorithm can generate a strong classifier with fewer weak classifiers than the conventional AdaBoost algorithm. Second, face localization with the Fisher boosting feature subspace outperforms localization with the PCA feature subspaces in both accuracy and convergence rate. Third, the hierarchical CONDENSATION framework alleviates the local minima problem frequently encountered by previous ASM optimization algorithms.
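
The MAP formulation above can be illustrated with a minimal sketch of factored sampling, the single-image building block that CONDENSATION iterates. This is not the paper's implementation: it assumes a one-dimensional configuration parameter, a Gaussian prior standing in for the ASM prior, and a hand-picked two-component mixture of Gaussians as the likelihood. The hierarchical coarse-to-fine stage and the Fisher boosting feature space are not modeled, and all function names and numeric values are hypothetical.

```python
import numpy as np

def mog_likelihood(x, weights, means, sigmas):
    """Mixture-of-Gaussians density at each entry of x (1-D toy likelihood)."""
    x = np.asarray(x, dtype=float)[:, None]          # shape (N, 1)
    weights = np.asarray(weights, dtype=float)       # shape (K,)
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    comps = np.exp(-0.5 * ((x - means) / sigmas) ** 2) / (sigmas * np.sqrt(2.0 * np.pi))
    return comps @ weights                           # shape (N,)

def factored_sampling_map(likelihood, prior_mean, prior_std, n_particles=2000, seed=0):
    """Approximate the MAP estimate with one factored-sampling step.

    Particles are drawn from the (assumed Gaussian) prior and weighted by the
    likelihood, so the weighted set approximates the posterior. The MAP
    estimate is taken as the particle maximizing prior density * likelihood.
    """
    rng = np.random.default_rng(seed)
    particles = rng.normal(prior_mean, prior_std, size=n_particles)
    lik = likelihood(particles)
    total = lik.sum()
    if total <= 0.0:
        raise ValueError("all particles have zero likelihood")
    weights = lik / total                            # normalized posterior weights
    prior_pdf = np.exp(-0.5 * ((particles - prior_mean) / prior_std) ** 2) / (
        prior_std * np.sqrt(2.0 * np.pi))
    estimate = particles[np.argmax(prior_pdf * lik)]
    return estimate, particles, weights

# Toy usage: a dominant likelihood mode at 1.0 and a weaker, distant mode at 4.0.
toy_likelihood = lambda x: mog_likelihood(x, [0.8, 0.2], [1.0, 4.0], [0.3, 0.3])
estimate, particles, weights = factored_sampling_map(toy_likelihood, 0.0, 2.0)
print("approximate MAP estimate:", estimate)
```

In this toy setting the prior pulls the estimate toward the mode nearer to zero. A full CONDENSATION tracker would additionally resample the particles by weight and propagate them through a dynamic model at every step.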
