Robust Facial Feature Detection and Tracking for Head Pose Estimation in a Novel Multimodal Interface for Social Skills Learning

This paper presents a robust and efficient approach to facial feature detection and tracking for head pose estimation. Six facial feature points (the inner eye corners, the nostrils, and the mouth corners) are detected and tracked using multiple cues: facial feature intensity and its probability distribution, based on a novel histogram entropy analysis; geometric characteristics; and motion information. The head pose is estimated from the tracked points and a 3D facial feature model using the POSIT and RANSAC algorithms. The method is demonstrated on gaze tracking in a new multimodal technology-enhanced learning (TEL) environment that supports the learning of social communication skills.
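
The abstract does not spell out the histogram entropy analysis, so the following is only a minimal sketch of the generic quantity such an analysis would build on: the Shannon entropy of an image patch's intensity histogram. The function name `histogram_entropy`, the bin count, and the 8-bit intensity range are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def histogram_entropy(pixels, bins=32, max_value=255):
    """Shannon entropy (in bits) of the intensity histogram of a patch.

    `pixels` is a flat sequence of grey levels in [0, max_value].
    Hypothetical helper: the paper's actual entropy analysis is not
    described in the abstract and may differ.
    """
    if not pixels:
        raise ValueError("patch is empty")
    bin_width = (max_value + 1) / bins
    # Count how many pixels fall into each histogram bin.
    counts = Counter(min(int(p / bin_width), bins - 1) for p in pixels)
    total = len(pixels)
    # H = sum over non-empty bins of p * log2(1 / p).
    return sum((c / total) * math.log2(total / c) for c in counts.values())


flat_patch = [128] * 64          # uniform region, e.g. smooth skin
two_level_patch = [0, 255] * 32  # strong dark/bright contrast

print(histogram_entropy(flat_patch))
print(histogram_entropy(two_level_patch))
```

A uniform patch yields an entropy of 0 bits, while a patch split evenly between two intensity bins yields 1 bit, so under these assumptions the measure separates flat regions from high-contrast ones such as eye corners or nostrils.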
