Face pose estimation with ensemble multi-scale representations

Face pose estimation plays an important role in a broad range of applications, such as vision-based surveillance, face authentication, and intelligent human-computer interaction. It remains a challenging problem, however, especially in complicated real-world environments. In this paper, we propose a novel face pose estimation approach that integrates two multi-scale representations. The first is a multi-scale VGG-Face representation: using the VGG-Face CNN as a backbone, the outputs of three intermediate-scale layers are extracted and refined through additional transfer learning. The second is a multi-scale Curvelet representation. The two multi-scale sub-representations are integrated, and several dense layers are then added to form the complete ensemble system, which predicts the face pose. Experimental results show that the proposed approach achieves mean absolute errors (MAE) of 0.33° for yaw and 0.23° for pitch on the CAS-PEAL pose database, and 3.88° for yaw and 1.98° for pitch on the Pointing'04 database.
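The MAE figures quoted above are per-angle averages of the absolute difference between predicted and ground-truth angles. The following is a minimal sketch of that metric only, not the authors' evaluation code: the function names `mean_absolute_error` and `pose_mae`, the `(yaw, pitch)` tuple layout, and the sample angles are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average of |predicted - actual| over all samples, in degrees."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - a) for p, a in zip(predicted, actual))
    return total / len(predicted)


def pose_mae(
    predicted_poses: List[Tuple[float, float]],
    actual_poses: List[Tuple[float, float]],
) -> Tuple[float, float]:
    """Return (yaw MAE, pitch MAE) for lists of (yaw, pitch) pairs in degrees.

    The (yaw, pitch) ordering is an assumption of this sketch.
    """
    yaw_mae = mean_absolute_error(
        [pose[0] for pose in predicted_poses],
        [pose[0] for pose in actual_poses],
    )
    pitch_mae = mean_absolute_error(
        [pose[1] for pose in predicted_poses],
        [pose[1] for pose in actual_poses],
    )
    return yaw_mae, pitch_mae


# Hypothetical sample data, not taken from CAS-PEAL or Pointing'04.
example_predicted = [(14.0, 0.5), (-31.0, -0.5), (1.0, 29.5)]
example_actual = [(15.0, 0.0), (-30.0, 0.0), (0.0, 30.0)]
example_yaw_mae, example_pitch_mae = pose_mae(example_predicted, example_actual)
```

Reporting yaw and pitch separately, as the abstract does, keeps a large error on one axis from being hidden by a small error on the other.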
