The Best of Both Worlds: Combining 3D Deformable Models with Active Shape Models

Reliable 3D tracking is still a difficult task. Most parametrized 3D deformable models rely on the accurate extraction of image features to update their parameters, and are prone to failure when the assumptions about the underlying feature distribution do not hold. Active Shape Models (ASMs), on the other hand, are learning-based and thus require fewer reliable local image features than parametrized 3D models, but they fail easily when they encounter a situation for which they were not trained. In this paper, we develop an integrated framework that combines the strengths of both 3D deformable models and ASMs. The 3D model governs the overall shape, orientation, and location, and provides the basis for statistical inference on both the image features and the parameters. The ASMs, in contrast, provide the majority of reliable 2D image features over time, and aid in recovering from drift and total occlusions. The framework dynamically selects among different ASMs to compensate for large viewpoint changes due to head rotations. This integration allows the robust tracking of faces and the estimation of both their rigid and non-rigid motions. We demonstrate the strength of the framework in experiments that include automated 3D model fitting and facial expression tracking for a variety of applications, including sign language.
