3-D Head Tracking via Invariant Keypoint Learning

Keypoint matching is a standard tool for solving the correspondence problem in vision applications. In 3-D face tracking, however, this approach often falls short: the complexity of the human face, combined with the large viewpoint changes, nonrigid expressions, and lighting variations of typical applications, produces appearance changes that existing keypoint detectors and descriptors cannot handle. In this paper, we propose a new approach that tailors keypoint matching to tracking the 3-D pose of the user's head in a video stream. The core idea is to learn keypoints that are explicitly invariant to these challenging transformations. First, we select keypoints that are stable under small, randomly drawn viewpoint changes, nonrigid deformations, and illumination changes. Then, we cast keypoint descriptor learning at different large angles as an incremental scheme for learning discriminative descriptors. At matching time, to reduce the ratio of outlier correspondences, we use second-order color information to prune keypoints that are unlikely to lie on the face. Moreover, we integrate optical flow correspondences adaptively to remove motion jitter efficiently. Extensive experiments show that the proposed approach yields fast, robust, and accurate 3-D head tracking, even in very challenging scenarios.
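
The stability-based selection step can be illustrated with a minimal sketch. It is not the paper's implementation: the detector (strict local maxima above a threshold), the perturbation model (a small integer shift standing in for a viewpoint change, a random gain standing in for an illumination change, plus additive noise), and every name and parameter value below are assumptions chosen only to show the idea of keeping keypoints that are redetected under most random perturbations. Nonrigid deformation is not modeled.

```python
import numpy as np


def detect_peaks(img, thresh):
    """Toy detector (an assumption): strict 4-neighbour local maxima above thresh."""
    c = img[1:-1, 1:-1]
    is_peak = (
        (c > img[:-2, 1:-1]) & (c > img[2:, 1:-1])
        & (c > img[1:-1, :-2]) & (c > img[1:-1, 2:])
        & (c > thresh)
    )
    rows, cols = np.nonzero(is_peak)
    return np.stack([rows + 1, cols + 1], axis=1)  # (N, 2) array of (row, col)


def select_stable_keypoints(img, n_trials=50, max_shift=2, noise=0.05,
                            thresh=0.5, tol=1, min_rate=0.8, seed=0):
    """Keep candidates that are redetected in at least min_rate of the trials."""
    rng = np.random.default_rng(seed)
    candidates = detect_peaks(img, thresh)
    hits = np.zeros(len(candidates), dtype=int)
    for _ in range(n_trials):
        # Hypothetical perturbation: small shift, random gain, additive noise.
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        gain = rng.uniform(0.8, 1.2)
        warped = gain * np.roll(img, (dy, dx), axis=(0, 1))
        warped = warped + rng.normal(0.0, noise, img.shape)
        # Detect in the perturbed image and map detections back to the original frame.
        found = detect_peaks(warped, thresh) - (dy, dx)
        if len(found) == 0:
            continue
        # Chebyshev distance from every candidate to every back-mapped detection.
        d = np.abs(candidates[:, None, :] - found[None, :, :]).max(axis=2)
        hits += d.min(axis=1) <= tol
    rate = hits / n_trials
    keep = rate >= min_rate
    return candidates[keep], rate[keep]


# Synthetic test image: one strong bump and one bump barely above the threshold.
yy, xx = np.mgrid[0:32, 0:32]


def bump(r, c, amp):
    return amp * np.exp(-((yy - r) ** 2 + (xx - c) ** 2) / 2.0)


face = bump(10, 10, 1.0) + bump(20, 20, 0.51)
stable, rate = select_stable_keypoints(face)
```

In this toy setting the strong bump survives every gain draw, while the weak one drops below the detection threshold in a large share of trials and is filtered out. A real system would replace the shift-and-gain model with rendered viewpoint, expression, and lighting changes of a face model.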
