LPHD: A Large-Scale Head Pose Dataset for RGB Images

Head pose estimation has attracted many research interest in recent years. With the advent of deep learning, it is possible to predict the head pose accurately from the RGB images without the help of facial landmarks or depth information. However, existing head pose datasets often lack large pose head images, which extremely limits the development of head pose estimation algorithms. In this paper, we build the largescale head pose dataset (LHPD) including more than 140,000 images with the diverse and accurate head poses. The LHPD dataset includes the head images recorded from different shooting angles between the camera and the human body for the first time, which greatly expands the range of head pose compared to previous datasets. Therefore, the range of head pose can cover +/-90° for each Euler angle. The accurate and reliable head pose annotation is labeled by the motion capture system and careful calibration procedures. We then propose a head pose estimation method through fine-tuning the ResNet on the LHPD dataset when using the Euclidean distance of quaternions as the loss function. The results show that our method achieves better performance than current state-of-theart algorithms.

[1]  Marco La Cascia,et al.  Fast, Reliable Head Tracking under Varying Illumination: An Approach Based on Registration of Texture-Mapped 3D Models , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[2]  Harry Wechsler,et al.  Face pose discrimination using support vector machines (SVM) , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[3]  Luc Van Gool,et al.  Random Forests for Real Time 3D Face Analysis , 2012, International Journal of Computer Vision.

[4]  Deva Ramanan,et al.  Face detection, pose estimation, and landmark localization in the wild , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Yann LeCun,et al.  Synergistic Face Detection and Pose Estimation with Energy-Based Models , 2004, J. Mach. Learn. Res..

[7]  Rama Chellappa,et al.  HyperFace: A Deep Multi-Task Learning Framework for Face Detection, Landmark Localization, Pose Estimation, and Gender Recognition , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  B. Roth,et al.  Motion Synthesis Using Kinematic Mappings , 1983 .

[9]  James M. Rehg,et al.  Fine-Grained Head Pose Estimation Without Keypoints , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[10]  In-So Kweon,et al.  Real-Time Head Orientation from a Monocular Camera Using Deep Neural Network , 2014, ACCV.

[11]  Qiang Ji,et al.  Facial Landmark Detection: A Literature Survey , 2018, International Journal of Computer Vision.

[12]  Georgios Tzimiropoulos,et al.  How Far are We from Solving the 2D & 3D Face Alignment Problem? (and a Dataset of 230,000 3D Facial Landmarks) , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[13]  Qiang Ji,et al.  Differentiating Between Posed and Spontaneous Expressions with Latent Regression Bayesian Network , 2017, AAAI.

[14]  Subramanian Ramanathan,et al.  On the relationship between head pose, social attention and personality prediction for unstructured and dynamic group interactions , 2013, ICMI '13.

[15]  Mohan M. Trivedi,et al.  Head Pose Estimation in Computer Vision: A Survey , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Louis-Philippe Morency,et al.  Convolutional Experts Constrained Local Model for 3D Facial Landmark Detection , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[17]  Du Q. Huynh,et al.  Metrics for 3D Rotations: Comparison and Analysis , 2009, Journal of Mathematical Imaging and Vision.

[18]  V. Ferrario,et al.  Active range of motion of the head and cervical spine: a three‐dimensional investigation in healthy young adults , 2002, Journal of orthopaedic research : official publication of the Orthopaedic Research Society.

[19]  Mario Fritz,et al.  Appearance-based gaze estimation in the wild , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Rainer Stiefelhagen,et al.  DriveAHead — A Large-Scale Driver Head Pose Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[21]  Shaogang Gong,et al.  Face distributions in similarity space under varying head pose , 2001, Image Vis. Comput..

[22]  Sheng Wan,et al.  QuatNet: Quaternion-Based Head Pose Estimation With Multiregression Loss , 2019, IEEE Transactions on Multimedia.

[23]  William T. Freeman,et al.  Example-based head tracking , 1996, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition.

[24]  Xiaoming Liu,et al.  Face Alignment in Full Pose Range: A 3D Total Solution , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Xiangyu Zhu,et al.  Face Alignment in Full Pose Range: A 3D Total Solution , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[26]  Angelo Cangelosi,et al.  Head pose estimation in the wild using Convolutional Neural Networks and adaptive gradient methods , 2017, Pattern Recognit..

[27]  Yu Qiao,et al.  Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks , 2016, IEEE Signal Processing Letters.

[28]  Jean-Marc Odobez,et al.  We are not contortionists: Coupled adaptive learning for head and body orientation estimation in surveillance video , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.