Appearance-based head pose estimation with scene-specific adaptation

We propose an appearance-based head pose estimation method that adapts automatically to individual scenes. Appearance-based methods usually require a ground-truth dataset captured in a scene similar to the test video sequences, but collecting many manually labeled head images for every scene is rarely practical. To address this problem, we introduce an approach that gathers ground-truth head pose labels automatically by inferring them from walking direction. Experimental results show that the proposed method estimates head pose more accurately than the conventional approach, which uses a scene-independent generic dataset.
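As a rough illustration of the labeling idea, the sketch below derives coarse pose labels from a tracked pedestrian's image-plane trajectory. It is not the authors' implementation: the function name, the eight direction bins, and the minimum-step threshold are all assumptions made for this example, and it simply assumes that a walking person faces the direction of travel.

```python
import math

def walking_direction_labels(track, num_bins=8, min_step=1.0):
    """Assign a coarse direction bin to each frame with enough movement.

    track    : list of (x, y) positions, one per frame (hypothetical input)
    num_bins : number of direction classes (assumed value, not from the paper)
    min_step : minimum displacement per frame; slower frames get no label,
               since a near-stationary person gives no reliable direction
    """
    labels = []
    bin_width = 2 * math.pi / num_bins
    for i in range(1, len(track)):
        dx = track[i][0] - track[i - 1][0]
        dy = track[i][1] - track[i - 1][1]
        if math.hypot(dx, dy) < min_step:
            continue  # skip frames with too little motion
        # Direction of travel, wrapped into [0, 2*pi)
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Bins are centered on multiples of bin_width (bin 0 is centered on angle 0)
        label = int((angle + bin_width / 2) // bin_width) % num_bins
        labels.append((i, label))
    return labels
```

Head crops taken at the returned frame indices could then be paired with these labels to form a scene-specific training set. A practical system would likely also smooth the trajectory before differencing it.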
