Fusion of Keypoint Tracking and Facial Landmark Detection for Real-Time Head Pose Estimation

In this paper, we address the problem of extreme head pose estimation from intensity images in a monocular setup. We introduce a novel fusion pipeline built on a dedicated Kalman Filter: the pose estimated by a tracking scheme is used in the prediction stage, and the pose estimated by a detection scheme is used in the correction stage. To that end, the measurement covariance of the Kalman Filter is updated in every frame. The tracking scheme uses a set of keypoints extracted in the area of the head along with a simple 3D geometric model. The detection scheme, on the other hand, relies on the alignment of facial landmarks in each frame combined with 3D features extracted on a head mesh. In each scheme, the head pose is estimated by minimizing the reprojection error of the 3D-2D correspondences. By combining both frameworks, we extend the applicability of head pose estimation from facial landmarks to cases where these features are no longer visible. We compared the proposed method to other related approaches, showing that it can achieve state-of-the-art performance. We also demonstrate that our approach handles extreme head rotations and (self-)occlusions and is suitable for real-time applications.
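The fusion step described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a 6-DOF pose vector (three rotation-vector components followed by three translation components), an identity measurement model, and a per-frame measurement covariance derived from a hypothetical landmark-alignment confidence in [0, 1]. The names `fuse_pose` and `measurement_covariance` and the parameters `base_var` and `max_var` are illustrative assumptions; the abstract does not specify how the covariance is actually computed.

```python
import numpy as np


def measurement_covariance(confidence, base_var=1e-2, max_var=1e6):
    """Hypothetical per-frame measurement covariance R (assumption, not from the paper).

    A low landmark-alignment confidence inflates the variance, so the
    detection pose has little influence when landmarks are unreliable.
    """
    confidence = float(np.clip(confidence, 0.0, 1.0))
    var = min(base_var / max(confidence, 1e-12), max_var)
    return var * np.eye(6)


def fuse_pose(tracked_pose, P_prev, Q, detected_pose, R):
    """One Kalman step: tracking pose as prediction, detection pose as correction.

    tracked_pose, detected_pose: 6-vectors (rotation vector, translation).
    P_prev: previous state covariance; Q: process noise; R: measurement noise.
    """
    # Prediction stage: the tracking scheme supplies the predicted state.
    x_pred = np.asarray(tracked_pose, dtype=float)
    P_pred = P_prev + Q

    # Correction stage: the detection scheme supplies the measurement.
    z = np.asarray(detected_pose, dtype=float)
    S = P_pred + R                       # innovation covariance (H = I)
    K = P_pred @ np.linalg.inv(S)        # Kalman gain
    x = x_pred + K @ (z - x_pred)
    P = (np.eye(x.size) - K) @ P_pred
    return x, P


if __name__ == "__main__":
    tracked = np.zeros(6)
    detected = np.full(6, 0.3)
    P0 = 1e-2 * np.eye(6)
    Q = 1e-2 * np.eye(6)

    # Landmarks well aligned: the fused pose moves toward the detection.
    x_vis, _ = fuse_pose(tracked, P0, Q, detected, measurement_covariance(1.0))
    # Landmarks not visible: the fused pose stays on the tracking estimate.
    x_occ, _ = fuse_pose(tracked, P0, Q, detected, measurement_covariance(0.0))
    print(x_vis, x_occ)
```

Blending rotation vectors linearly is only a reasonable approximation when the tracked and detected rotations are close; a fuller treatment would likely fuse rotations on the rotation manifold. The sketch is only meant to show how a frame-dependent measurement covariance lets the filter fall back on tracking when facial landmarks are no longer visible.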
