A two-stage head pose estimation framework and evaluation

Head pose is an important indicator of a person's focus of attention. Head pose estimation can also serve as a front end for multi-view face analysis: face recognition and identification algorithms are usually view dependent, so pose classification can help such systems select the best view model. Subspace analysis has been widely used for head pose estimation, but such techniques are usually sensitive to data alignment and background noise. This paper proposes a two-stage approach that addresses this issue by combining subspace analysis with the topography method. The first stage is based on subspace analysis of Gabor wavelet responses. Different subspace techniques were compared to find the one that best captures the underlying data structure. Nearest prototype matching with Euclidean distance gives the pose estimate, and this single estimate is then relaxed to a subset of neighboring poses to tolerate some misalignment and background noise. In the second stage, the pose estimate is refined by analyzing the finer geometrical structure captured by bunch graphs. This coarse-to-fine framework was evaluated on a large data set covering 86 poses, with the pan angle spanning from -90° to 90° and the tilt angle spanning from -60° to 45°. The experimental results indicate that the integrated approach performs markedly better than subspace analysis alone.
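As a rough illustration of the first stage only, the following Python sketch performs nearest prototype matching with Euclidean distance and then relaxes the single estimate to a subset of nearby poses. It is a minimal sketch under stated assumptions, not the paper's implementation: the two-dimensional feature vectors stand in for Gabor wavelet responses already projected into a subspace, and the function names, toy prototypes, and tolerance values (`pan_tol`, `tilt_tol`) are hypothetical. The second-stage refinement with bunch graphs is not shown.

```python
import math

def nearest_prototype(feature, prototypes):
    """Return the pose whose prototype is closest to `feature` (Euclidean distance)."""
    return min(prototypes, key=lambda pose: math.dist(feature, prototypes[pose]))

def relax_estimate(pose, all_poses, pan_tol, tilt_tol):
    """Return every pose within the given pan/tilt tolerance of the coarse estimate."""
    pan, tilt = pose
    return [p for p in all_poses
            if abs(p[0] - pan) <= pan_tol and abs(p[1] - tilt) <= tilt_tol]

# Toy prototypes keyed by (pan, tilt) in degrees. The vectors are made-up
# placeholders for subspace projections of Gabor responses (assumption).
prototypes = {
    (-30, 0): [0.0, 1.0],
    (0, 0): [1.0, 1.0],
    (30, 0): [2.0, 1.0],
    (0, 15): [1.0, 2.0],
    (90, 45): [9.0, 9.0],
}

feature = [1.1, 1.2]  # hypothetical projected feature of a test image

# Stage 1a: single coarse pose estimate.
coarse = nearest_prototype(feature, prototypes)

# Stage 1b: relax to neighboring poses; the tolerances are illustrative only.
candidates = relax_estimate(coarse, list(prototypes), pan_tol=30, tilt_tol=15)

print(coarse)
print(candidates)
```

In this sketch the candidate subset is what a second stage would then search over, so an alignment error that shifts the coarse estimate by one pose step can still be recovered.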
