Boosted human head pose estimation using Kinect camera

Head pose estimation is essential for several computer vision applications. For example, it has been employed in facial expression recognition, head gesture detection, and driver monitoring systems. In this work, we present a boosted method to estimate the head pose using a Kinect camera. The estimation draws jointly on the RGB and depth images. The human face is located in the RGB image using frontal and profile Viola-Jones (VJ) face detectors, while the depth information is used to confine the size and location of the search window. Appearance features, extracted from the detected face patch in the RGB image and from its corresponding patch in the depth image, are passed to Support Vector Machine (SVM) regressors to infer the head pose. Evaluation on two public benchmark databases demonstrates that our proposed approach compares favorably to state-of-the-art approaches.
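One way depth can confine the size of the search window is through the pinhole camera relation: an object of physical width W at distance Z spans roughly f * W / Z pixels, where f is the focal length in pixels. The sketch below illustrates that idea only; it is not taken from the paper. The function name `face_window_bounds`, the focal length of 525 pixels, and the assumed face-width range of 0.12 to 0.20 m are all illustrative assumptions.

```python
def face_window_bounds(depth_m, focal_px=525.0, face_width_m=(0.12, 0.20)):
    """Estimate the pixel side-length range of a face search window.

    Uses the pinhole projection size_px = focal_px * width_m / depth_m.
    focal_px and face_width_m are assumed values, not from the paper.
    """
    if depth_m <= 0:
        raise ValueError("depth_m must be positive")
    narrow_m, wide_m = face_width_m
    min_px = focal_px * narrow_m / depth_m
    max_px = focal_px * wide_m / depth_m
    return min_px, max_px


def window_is_plausible(window_px, depth_m, **kwargs):
    """Return True if a candidate window size is consistent with the depth."""
    min_px, max_px = face_window_bounds(depth_m, **kwargs)
    return min_px <= window_px <= max_px


if __name__ == "__main__":
    # A face candidate 1.5 m from the sensor: print the admissible size range.
    print(face_window_bounds(1.5))
```

A detector could skip any scale whose window size falls outside this range at the measured depth, which reduces both the number of windows evaluated and the chance of false positives at implausible scales.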
