Robust Model-Based 3D Head Pose Estimation

We introduce a method for accurate three dimensional head pose estimation using a commodity depth camera. We perform pose estimation by registering a morphable face model to the measured depth data, using a combination of particle swarm optimization (PSO) and the iterative closest point (ICP) algorithm, which minimizes a cost function that includes a 3D registration and a 2D overlap term. The pose is estimated on the fly without requiring an explicit initialization or training phase. Our method handles large pose angles and partial occlusions by dynamically adapting to the reliable visible parts of the face. It is robust and generalizes to different depth sensors without modification. On the Biwi Kinect dataset, we achieve best-in-class performance, with average angular errors of 2.1, 2.1 and 2.4 degrees for yaw, pitch, and roll, respectively, and an average translational error of 5.9 mm, while running at 6 fps on a graphics processing unit.

[1]  Luc Van Gool,et al.  Random Forests for Real Time 3D Face Analysis , 2012, International Journal of Computer Vision.

[2]  Zhengyou Zhang,et al.  3D Deformable Face Tracking with a Commodity Depth Camera , 2010, ECCV.

[3]  Antonis A. Argyros,et al.  Head pose estimation on depth data based on Particle Swarm Optimization , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[4]  Ying Cai,et al.  Robust Head Pose Estimation Using a 3D Morphable Model , 2015 .

[5]  Horst Bischof,et al.  3D-MAM: 3D morphable appearance model for efficient fine head pose estimation from still images , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[6]  Peter J. Angeline,et al.  Evolutionary Optimization Versus Particle Swarm Optimization: Philosophy and Performance Differences , 1998, Evolutionary Programming.

[7]  Hao Li,et al.  Realtime performance-based facial animation , 2011, ACM Trans. Graph..

[8]  Rainer Stiefelhagen,et al.  Real Time Head Model Creation and Head Pose Estimation on Consumer Depth Cameras , 2014, 2014 2nd International Conference on 3D Vision.

[9]  Luc Van Gool,et al.  Real time head pose estimation with random regression forests , 2011, CVPR 2011.

[10]  Andrew W. Fitzgibbon,et al.  Real-time human pose recognition in parts from single depth images , 2011, CVPR 2011.

[11]  Antonis A. Argyros,et al.  Efficient model-based 3D tracking of hand articulations using Kinect , 2011, BMVC.

[12]  Kok-Lim Low Linear Least-Squares Optimization for Point-to-Plane ICP Surface Registration , 2004 .

[13]  Luc Van Gool,et al.  Real-time face pose estimation from single range images , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Rainer Stiefelhagen,et al.  Head pose estimation using stereo vision for human-robot interaction , 2004, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..

[15]  Peter Robinson,et al.  3D Constrained Local Model for rigid and non-rigid facial tracking , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Kikuo Fujimura,et al.  3D head pose estimation with optical flow and depth constraints , 2003, Fourth International Conference on 3-D Digital Imaging and Modeling, 2003. 3DIM 2003. Proceedings..

[17]  Kun Zhou,et al.  3D shape regression for real-time facial animation , 2013, ACM Trans. Graph..

[18]  Thomas Vetter,et al.  A morphable model for the synthesis of 3D faces , 1999, SIGGRAPH.

[19]  P. Leva Adjustments to Zatsiorsky-Seluyanov's segment inertia parameters. , 1996 .

[20]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[21]  Li Zhang,et al.  Spacetime faces: high resolution capture for modeling and animation , 2004, SIGGRAPH 2004.

[22]  Lijun Yin,et al.  Automatic pose estimation of 3D facial models , 2008, 2008 19th International Conference on Pattern Recognition.

[23]  Riccardo Poli,et al.  Particle swarm optimization , 1995, Swarm Intelligence.

[24]  Luc Van Gool,et al.  Real Time Head Pose Estimation from Consumer Depth Cameras , 2011, DAGM-Symposium.

[25]  Marc Levoy,et al.  Efficient variants of the ICP algorithm , 2001, Proceedings Third International Conference on 3-D Digital Imaging and Modeling.

[26]  Jihun Yu,et al.  Realtime facial animation with on-the-fly correctives , 2013, ACM Trans. Graph..

[27]  Tobias Bär,et al.  Driver head pose and gaze estimation based on multi-template ICP 3-D point cloud alignment , 2012, 2012 15th International IEEE Conference on Intelligent Transportation Systems.

[28]  Sander Oude Elberink,et al.  Accuracy and Resolution of Kinect Depth Data for Indoor Mapping Applications , 2012, Sensors.

[29]  Paul J. Besl,et al.  Method for registration of 3-D shapes , 1992, Other Conferences.

[30]  Maurice Clerc,et al.  The particle swarm - explosion, stability, and convergence in a multidimensional complex space , 2002, IEEE Trans. Evol. Comput..

[31]  Michael J. Jones,et al.  Real-time 3D head pose and facial landmark estimation from depth images using triangular surface patch features , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Mohan M. Trivedi,et al.  Head Pose Estimation in Computer Vision: A Survey , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Chen Qian,et al.  Realtime and Robust Hand Tracking from Depth , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[34]  Ahmed M. Elgammal,et al.  High Resolution Acquisition, Learning and Transfer of Dynamic 3‐D Facial Expressions , 2004, Comput. Graph. Forum.

[35]  Martin D. Levine,et al.  Registering Multiview Range Data to Create 3D Computer Objects , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[36]  Nicu Sebe,et al.  Robust Real-Time Extreme Head Pose Estimation , 2014, 2014 22nd International Conference on Pattern Recognition.

[37]  Luc Van Gool,et al.  Fast 3D Scanning with Automatic Motion Compensation , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[38]  Walid Mahdi,et al.  3D Face Pose Tracking using Low Quality Depth Cameras , 2013, VISAPP.