The More, the Merrier? A Study on In-Car IR-based Head Pose Estimation

Deep learning methods have proven useful for head pose estimation, but the effect of network depth, network type, and input resolution on estimation from infrared (IR) images still needs to be explored. In this paper, we present a study on in-car head pose estimation on the IR images of the AutoPOSE dataset, from which we extract cropped head images of 64 × 64 and 128 × 128 pixels. We propose two novel networks, the Head Orientation Network (HON) and ResNetHG, and compare them with state-of-the-art methods such as the HPN model from DriveAHead at different input resolutions. In addition, we evaluate multiple depths of our HON and ResNetHG networks and their effect on accuracy. Our experiments show that higher-resolution images lead to lower estimation errors. Furthermore, we show that deep learning methods with fewer layers perform better on head orientation regression from IR images. Our HON and ResNetHG18 architectures outperform the state of the art on IR images on four different metrics, reducing the residual error by up to 74%.
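As an illustration of the input preparation mentioned above, the following is a minimal sketch of how a square head crop could be extracted from a single-channel IR frame and resampled to 64 × 64 or 128 × 128 pixels. The function name `crop_head`, the head-center and box-size inputs, the zero padding at image borders, and the nearest-neighbor resampling are all assumptions made for this example; the abstract does not specify the actual cropping procedure.

```python
import numpy as np


def crop_head(image, center_xy, box_size, out_size):
    """Crop a square around a head center and resample it to out_size x out_size.

    Hypothetical helper: `image` is a 2D (grayscale/IR) array, `center_xy` is an
    integer (x, y) pixel position inside the image, `box_size` is the side
    length of the crop in source pixels.
    """
    cx, cy = center_xy
    half = box_size // 2
    # Zero-pad so that crops near the image border stay square.
    padded = np.pad(image, ((half, half), (half, half)), mode="constant")
    # After padding, original pixel (cy, cx) sits at (cy + half, cx + half),
    # so the crop centered on it starts at (cy, cx) in padded coordinates.
    crop = padded[cy:cy + box_size, cx:cx + box_size]
    # Nearest-neighbor resampling: pick out_size evenly spaced rows and columns.
    idx = (np.arange(out_size) * box_size / out_size).astype(int)
    return crop[np.ix_(idx, idx)]


# Example: one synthetic 480 x 640 frame, cropped at both resolutions.
frame = np.random.default_rng(0).integers(0, 256, size=(480, 640), dtype=np.uint8)
head_64 = crop_head(frame, center_xy=(320, 240), box_size=200, out_size=64)
head_128 = crop_head(frame, center_xy=(320, 240), box_size=200, out_size=128)
```

In practice a smoother interpolation (for example bilinear) would likely be preferred over nearest-neighbor sampling; it is used here only to keep the sketch dependency-free.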
