Head pose estimation using improved label distribution learning with fewer annotations

Head pose estimation in unconstrained environments remains a challenging task due to background clutter, illumination changes, and appearance variability. Multivariate label distribution has been successfully applied to head pose estimation. However, it does not carry over well to unconstrained environments, where assigning reasonable label distributions to images is difficult, and its performance degrades significantly when accurate grid information is unavailable (e.g., only yaw angles are known). To alleviate these problems, we propose an improved label distribution learning approach that requires fewer annotations. A data-driven weak learning strategy is first developed to construct label distributions, which addresses the problem of unreasonable label distributions. Regularization terms (e.g., L1,2 norm) are then introduced into the loss function induced by weighted Jeffreys divergence to avoid over-fitting. To further improve performance, positive correlation and negative competition are also introduced into the loss function to fine-tune the parameters of the corresponding model. Extensive experiments have been conducted on two public databases, LFW and Pointing04. The proposed method achieves performance comparable to the state of the art and generalizes well while using fewer annotations, which suggests that it has strong potential for head pose estimation in unconstrained environments where sufficient annotations are often unavailable.
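
To make the loss named in the abstract more concrete, the following is a minimal sketch of a weighted Jeffreys divergence between two discrete label distributions over yaw angles. It is an illustration under stated assumptions, not the paper's implementation: the function names, the per-bin weights, the 15-degree yaw grid, and the fixed Gaussian used to build the distributions are all hypothetical. In particular, the Gaussian is a simple stand-in for the data-driven weak learning strategy, which the abstract does not specify, and the regularization and correlation/competition terms are omitted.

```python
import math


def gaussian_label_distribution(angles, true_angle, sigma=15.0):
    """Build a discrete label distribution centred on true_angle.

    Assumption: a fixed Gaussian is used here only for illustration;
    the paper constructs its distributions with a data-driven strategy.
    """
    scores = [math.exp(-((a - true_angle) ** 2) / (2.0 * sigma ** 2))
              for a in angles]
    total = sum(scores)
    return [s / total for s in scores]  # normalise so the bins sum to 1


def weighted_jeffreys(p, q, weights=None, eps=1e-12):
    """Weighted Jeffreys divergence: sum_i w_i * (p_i - q_i) * (log p_i - log q_i).

    Assumption: the weights are per-bin and default to 1; the paper's
    actual weighting scheme is not given in the abstract.
    """
    if weights is None:
        weights = [1.0] * len(p)
    total = 0.0
    for w, p_i, q_i in zip(weights, p, q):
        p_i = max(p_i, eps)  # clamp to avoid log(0)
        q_i = max(q_i, eps)
        total += w * (p_i - q_i) * (math.log(p_i) - math.log(q_i))
    return total


# Hypothetical yaw grid from -90 to +90 degrees in 15-degree steps.
yaw_bins = list(range(-90, 91, 15))
target = gaussian_label_distribution(yaw_bins, true_angle=30)
close = gaussian_label_distribution(yaw_bins, true_angle=45)
far = gaussian_label_distribution(yaw_bins, true_angle=-60)

loss_close = weighted_jeffreys(target, close)
loss_far = weighted_jeffreys(target, far)
```

Because every term in the sum is non-negative and the expression is symmetric in its two arguments, a prediction centred near the true yaw (`close`) yields a smaller loss than one centred far away (`far`), which is the behaviour a label distribution loss needs.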
