Head Pose Estimation for an Omnidirectional Camera Using a Convolutional Neural Network

The human head pose provides insight into the activities or intentions of a person. Head pose estimation techniques are therefore often employed in intelligent surveillance camera systems for marketing analysis or security monitoring. Omnidirectional cameras are now widely used in surveillance systems owing to their wide coverage. However, this wide coverage causes significant changes in visual appearance and distortion within the image, so general approaches that rely on a head image alone may fail. In this paper, we therefore propose a method for head pose estimation from omnidirectional camera images. The proposed model uses both a head image and a full-body image to handle cases in which the face is self-occluded and the head image alone is almost useless. In addition, image attribute data are integrated into the network so that it can learn the relation between changes in appearance or distortion and location within the whole image. Experiments compare the accuracy of the proposed approach with that of conventional methods. The results show that the proposed method improves accuracy by more than 19% over the baseline method.
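
The idea of feeding location-dependent attribute data alongside head and body features can be illustrated with a small sketch. The abstract does not say which attributes are used or how they are fused, so everything here is an assumption for illustration: the functions `location_attributes` and `fuse_inputs` are hypothetical, the attributes are taken to be the normalized radial distance and azimuth of a person's bounding box relative to the image center plus the box size, and fusion is taken to be plain concatenation of feature vectors.

```python
import math


def location_attributes(bbox, image_size):
    """Hypothetical attribute vector for one detection in an omnidirectional image.

    bbox: (x_min, y_min, x_max, y_max) in pixels.
    image_size: (width, height) in pixels.
    Returns [radius, sin(azimuth), cos(azimuth), box_width, box_height],
    with all lengths normalized by the image dimensions.
    """
    x_min, y_min, x_max, y_max = bbox
    width, height = image_size
    cx, cy = width / 2.0, height / 2.0

    # Offset of the box center from the image center (assumed optical center).
    dx = (x_min + x_max) / 2.0 - cx
    dy = (y_min + y_max) / 2.0 - cy

    # Radial distance, normalized by the shorter half-dimension of the image.
    radius = math.hypot(dx, dy) / min(cx, cy)

    # Azimuth encoded as sine and cosine to avoid the wrap-around at +/- pi.
    azimuth = math.atan2(dy, dx)

    return [
        radius,
        math.sin(azimuth),
        math.cos(azimuth),
        (x_max - x_min) / width,
        (y_max - y_min) / height,
    ]


def fuse_inputs(head_features, body_features, attributes):
    """Concatenate head, body, and attribute vectors (assumed fusion scheme)."""
    return list(head_features) + list(body_features) + list(attributes)


# Example: a person near the right edge of a 1000 x 1000 image.
attrs = location_attributes((900, 450, 1000, 550), (1000, 1000))
fused = fuse_inputs([0.2, 0.7], [0.1, 0.4, 0.9], attrs)
```

In a real network the head and body vectors would come from convolutional branches and the fused vector would feed fully connected layers that output the pose; the sketch only shows how position inside the image could enter the model as an explicit input.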
