Paper information - Mixture of Deep Regression Networks for Head Pose Estimation

Accurate and robust head pose estimation is a challenging computer vision task. Most existing methods estimate head pose directly from single-modal RGB or depth images. These methods have two obvious drawbacks: (1) traditional shallow models are not good at learning representative features, and (2) as single-modal approaches, they are sensitive to noise. In this work we therefore propose a novel multi-modal regression model for head pose estimation, named mixture of deep regression networks (MoDRN). It uses only the good examples of a given modality to learn that modality's sub-network parameters. As a result, the sub-networks tend to be better trained and more robust to noise, and their combination yields significantly improved performance. Experiments on public datasets such as BIWI and BU-3DFE show the effectiveness of our approach.
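
The abstract does not say how the per-modality sub-networks are combined, so the following is only a minimal sketch of one common choice, a softmax-gated mixture of experts, and not necessarily the MoDRN formulation. The function names, the gate scores, and the example predictions are all hypothetical, and averaging Euler angles linearly is a simplification that is only reasonable for small angular differences.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_predictions(predictions, gate_scores):
    """Blend per-modality pose predictions with softmax gate weights.

    predictions: dict mapping modality name -> (yaw, pitch, roll) in degrees
    gate_scores: dict mapping modality name -> unnormalized confidence score
    Both arguments are assumptions made for this illustration.
    """
    names = sorted(predictions)
    weights = softmax([gate_scores[n] for n in names])
    pose = [0.0, 0.0, 0.0]
    for name, w in zip(names, weights):
        for i, angle in enumerate(predictions[name]):
            pose[i] += w * angle
    return tuple(pose), dict(zip(names, weights))


# Hypothetical outputs of an RGB and a depth sub-network for one frame.
predictions = {"rgb": (10.0, -4.0, 2.0), "depth": (14.0, -2.0, 0.0)}

# Equal gate scores give equal weights, so the result is the plain average.
pose, weights = combine_predictions(predictions, {"rgb": 0.0, "depth": 0.0})
print(pose)     # (12.0, -3.0, 1.0)
print(weights)  # {'depth': 0.5, 'rgb': 0.5}
```

In a gated mixture of this kind, the same weights can also scale each sub-network's training loss, so that a sub-network learns mostly from the examples on which its modality is reliable. That is one plausible reading of "uses only the good examples", stated here as an assumption.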
